This report summarizes the activities of the Parton Distributions Working Group of the 'QCD and Weak Boson Physics
workshop' held in preparation for Run II at the Fermilab Tevatron. The main focus of this working group was to investigate the
different issues associated with the development of quantitative tools to estimate parton distribution function uncertainties.
In the conclusion, we introduce a "Manifesto" that describes an optimal method for reporting data.



INTRODUCTION 

With Run II and its large increase in integrated lu- 
minosity, the Tevatron will enter an era of high pre- 
cision measurements. In this era, parton distribution 
function (PDF) uncertainties will play a major role. 

The basic questions for PDFs at the Tevatron Run
II are simple and common to all other experiments:

• What limitations will the PDFs put on physics 
analysis? 

• What information can we gain about the PDFs? 

Some qualitative tools exist that can be used to try to
answer these questions. However, besides S. Alekhin's
pioneering work [?], quantitative tools that attempt to
include all sources of uncertainty are not yet available.
The main focus of this working group has therefore been
to investigate the different issues associated with the
development of those tools, although other topics have
also been investigated.

We have divided this summary of activities into in- 
dividual contributions: 

• UNCERTAINTIES OF PARTON DISTRIBUTION
FUNCTIONS AND THEIR IMPLICATION ON
PHYSICAL PREDICTIONS. R. Brock et al. describe
preliminary results from an effort to quantify the
uncertainties in PDFs and the resulting uncertainties
in predicted physical quantities. The production cross
section of the W boson is given as a first example.

• PARTON DISTRIBUTION FUNCTION
UNCERTAINTIES. Giele et al. review the status
of their effort to extract PDFs from data with a
quantitative estimate of the uncertainties.

• EXPERIMENTAL UNCERTAINTIES AND
THEIR DISTRIBUTIONS IN THE INCLUSIVE
JET CROSS SECTION. R. Hirosky summarizes
the current CDF and D0 analyses of the inclusive
jet cross sections. So far the uncertainties have
been assumed to be Gaussian distributed. He
investigates what information can be extracted
about the shape of the uncertainties, with the goal
of providing a way to calculate the likelihood.

• PARTON DENSITY UNCERTAINTIES AND
SUSY PARTICLE PRODUCTION. T. Plehn
and M. Kramer study the current impact of PDF
uncertainties on SUSY particle mass bounds and
mass determinations.

• SOFT-GLUON RESUMMATION AND PDF 
THEORY UNCERTAINTIES. G. Sterman and 
W. Vogelsang discuss the interplay of higher or- 
der corrections and PDF determinations, and the 
possible use of soft-gluon resummation in global 
fits. 

• PARTON DISTRIBUTION FUNCTIONS:
EXPERIMENTAL DATA AND THEIR
INTERPRETATION. L. de Barbaro reviews current
issues in the interpretation of experimental data
and the outlook for future data.

• HEAVY QUARK PRODUCTION. Olness et al.
present a status report of a variety of projects
related to heavy quark production.

• PARTON DENSITIES FOR HEAVY QUARKS.
J. Smith compares different PDFs for heavy
quarks.

• CONSTRAINTS ON THE GLUON DEN- 
SITY FROM LEPTON PAIR PRODUCTION. 
E. L. Berger and M. Klasen study the sensitiv- 
ity of the hadroproduction of lepton pairs to the 
gluon density. 

Note that the individual references are at the end of
the corresponding contribution. The references for the
introduction and the conclusion are at the end.

Abstract

We describe preliminary results from an effort to quan- 
tify the uncertainties in parton distribution functions 
and the resulting uncertainties in predicted physical 
quantities. The production cross section of the W bo- 
son is given as a first example. Constraints due to the 
full data sets of the CTEQ global analysis are used 
in this study. Two complementary approaches, based 
on the Hessian and the Lagrange multiplier method 
respectively, are outlined. We discuss issues in
obtaining meaningful uncertainty estimates that include
the effect of correlated experimental systematic
uncertainties and illustrate them with detailed
calculations using one set of precision DIS data.



1. Introduction 

Many measurements at the Tevatron rely on parton 
distribution functions (PDFs) for significant portions 
of their data analysis as well as the interpretation of 
their results. For example, in cross section measure- 
ments the acceptance calculation often relies on Monte 
Carlo (MC) estimates of the fraction of unobserved 
events. As another example, the measurement of the 
mass of the W boson depends on PDFs via the mod- 
eling of the production of the vector boson in MC. In 
such cases, uncertainties in the PDFs contribute, by 
necessity, to uncertainties on the measured quantities. 
Critical comparisons between experimental data and 
the underlying theory are often even more dependent 
upon the uncertainties in PDFs. The uncertainties on 
the production cross sections for W and Z bosons, cur- 
rently limited by the uncertainty on the measured lu- 
minosity, are approximately 4%. At this precision, any 
comparison with the theoretical prediction inevitably 
raises the question: How "certain" is the prediction 
itself? 

A recent example of the importance of PDF uncertainty
is the proper interpretation of the measurement of the
high-$E_T$ jet cross-section at the Tevatron. When the
first CDF measurement was published [?], there was a
great deal of controversy over whether the observed
excess, compared to theory, could be explained by
deviations of the PDFs, especially the gluon, from the
conventionally assumed behavior, or whether it could
be the first signal for some new physics [?].

With the unprecedented precision and reach of many
of the Run I measurements, understanding the
implications of uncertainties in the PDFs has become a
burning issue. During Run II (and later at LHC) this issue
may strongly affect the uncertainty estimates in
precision Standard Model studies, such as the all-important
W-mass measurement, as well as the signal and
background estimates in searches for new physics.

In principle, it is the uncertainties on physical
quantities due to parton distributions, rather than on the
PDFs themselves, that are of primary concern. The
latter are theoretical constructs which depend on the
renormalization and factorization schemes, and there
are strong correlations between PDFs of different
flavors and at different values of x, which can compensate
each other in the convolution integrals that relate
them to physical cross-sections. On the other hand,
since PDFs are universal, if we can obtain meaningful
estimates of their uncertainties based on analysis
of existing data, then the results can be applied to all
processes that are of interest in the future [?, ?].

One can attempt to assess directly the uncertainty 
on a specific physical prediction due to the full range 
of PDFs allowed by available experimental constraints. 
This approach will provide a more reliable estimate for 
the range of possible predictions for the physical vari- 
able under study, and may be the best course of action 
for ultra-precise measurements such as the mass of the 
W boson or the W production cross-section. However, 
such results are process-specific and therefore the anal- 
ysis must be carried out for each case individually. 

Until recently, attempts to quantify either the
uncertainties on the PDFs themselves (via uncertainties
on their functional parameters, for instance) or the
uncertainty on derived quantities due to variations in
the PDFs have been rather unsatisfactory. Two commonly
used methods are: (1) comparing the predictions
obtained with different PDF sets, e.g., various
CTEQ [?], MRS [?] and GRV [?] sets; (2) within
a given global analysis effort, varying individual
functional parameters ad hoc, within limits considered to
be consistent with the existing data, e.g. [8]. Neither
method provides a systematic, quantitative measure of
the uncertainties of the PDFs or their predictions.

As a case in point, Fig. 1 shows how the calculated
value of the cross section for W boson production
at the Tevatron varies with a set of historical CTEQ
PDFs as well as the most recent CTEQ [?] and MRST
[?] sets. Also shown are the most recent measurements
from D0 and CDF.$^2$ While it is comforting to see that
the predictions have remained within a narrow range,
the variation observed cannot be characterized as a
meaningful estimate of the uncertainty: (i) the variation
with time reflects mostly the changes in experimental
input to, or analysis procedure of, the global
analyses; and (ii) the perfect agreement between the
values of the most recent CTEQ5M1$^3$ and MRS99 sets
must be fortuitous, since each group has also obtained
other satisfactory sets which give rise to much larger
variations of the W cross section. The MRST group, in
particular, has examined the range of this variation by
setting a variety of parameters to some extreme values
[?]. These studies are useful but cannot be considered
quantitative or definitive. What is needed are methods
that explore thoroughly the possible variations of the
parton distribution functions.

Figure 1. Predicted cross section for W boson production
for various PDFs.

2 It is interesting to note that much of the difference between the
D0 and CDF W cross sections is due to the different values of
the total $p\bar{p}$ cross sections used.

3 CTEQ5M1 is an updated version of CTEQ5M differing only
in a slight improvement in the QCD evolution (cf. note added
in proof of [?]). The differences are completely insignificant
for our purposes. Henceforth, we shall refer to them generically
as CTEQ5M. Both sets can be obtained from the web address
http://cteq.org/.

It is important to recognize all potential sources of
uncertainty in the determination of PDFs. Focusing
on some of these, while neglecting significant others,
may not yield practically useful results. Sources of
uncertainty are listed below:

• Statistical uncertainties of the experimental data
used to determine the PDFs. These vary over a wide
range among the experiments used in a global analysis,
but are straightforward to treat.

• Systematic uncertainties within each data set.
There are typically many sources of experimental
systematic uncertainty, some of which are highly
correlated. These uncertainties can be treated by standard
methods of probability theory provided they are
precisely known, which unfortunately is often not the
case, either because they may not be randomly
distributed or because their estimation in practice
involves subjective judgements.

• Theoretical uncertainties arising from higher-order
PQCD corrections, resummation corrections
near the boundaries of phase space, power-law (higher
twist) and nuclear target corrections, etc.

• Uncertainties due to the parametrization of the
non-perturbative PDFs, $f_a(x,Q_0)$, at some low
momentum scale $Q_0$. The specific choice of the
functional form used at $Q_0$ introduces implicit
correlations between the various x-ranges, which could be
as important as, if not more so than, the experimental
correlations in the determination of $f_a(x,Q)$ for all
Q.

Since strict quantitative statistical methods are 
based on idealized assumptions, such as random mea- 
surement uncertainties, an important trade-off must be 
faced in devising a strategy for the analysis of PDF 
uncertainties. If emphasis is put on the "rigor" of the
statistical method, then many important experiments
cannot be included in the analysis, either because the
published errors appear to fail strict statistical tests
or because data from different experiments appear to
be mutually exclusive in the parton distribution
parameter space [?]. If priority is placed on using the
maximal experimental constraints from available data, 
then standard statistical methods may not apply, but 
must be supplemented by physical considerations, tak- 
ing into account experimental and theoretical limita- 
tions. We choose the latter tack, pursuing the deter- 
mination of the uncertainties in the context of the cur- 
rent CTEQ global analysis. In particular, we include 
the same body of the world's data as constraints in our 
uncertainty study as that used in the CTEQ5 analy- 
sis; and adopt the "best fit" - the CTEQ5M1 set - as 
the base set around which the uncertainty studies are 
performed. In practice, there are unavoidable choices 
(and compromises) that must be made in the analysis. 
(Similar subjective judgements often are also necessary 
in estimating certain systematic errors in experimental 
analyses.) The most important consideration is that 
quantitative results must remain robust with respect 
to reasonable variations in these choices. 

In this Report we describe preliminary results obtained
by our group using the two approaches mentioned
earlier. In Section 3 we focus on the error matrix,
which characterizes the general uncertainties of
the non-perturbative PDF parameters. In Sections 4
and 5 we study specifically the production cross
section $\sigma_W$ for W bosons at the Tevatron, to estimate
the uncertainty of the prediction of $\sigma_W$ due to PDF
uncertainty. We start in Section 2 with a review of
some aspects of the CTEQ global analysis on which
this study is based.

2. Elements of the Base Global Analysis 

Since our strategy is based on using the existing
framework of the CTEQ global analysis, it is useful
to review some of its features pertinent to the current
study [?].

Data selection: 

Table 1 shows the experimental data sets included in
the CTEQ5 global analysis, and in the current study.
For neutral current DIS data only the most accurate 
proton and deuteron target measurements are kept, 
since they are the "cleanest" and they are already ex- 
tremely extensive. For charged current (neutrino) DIS 
data, the significant ones all involve a heavy (Fe) tar- 
get. Since these data are crucial for the determina- 
tion of the normalization of the gluon distribution (in- 
directly via the momentum sum rule), and for quark 
flavor differentiation (in conjunction with the neutral 
current data) , they play an important role in any com- 
prehensive global analysis. For this purpose, a heavy- 
target correction is applied to the data, based on mea- 
sured ratios for heavy-to-light targets from NMC and 
other experiments. Direct photon production data are
not included because of serious theoretical uncertainties,
as well as possible inconsistencies between existing
experiments. Cf. [?] and [?]. The combination of
neutral and charged DIS, lepton-pair production, lepton
charge asymmetry, and inclusive large-$p_T$ jet production
processes provides a fairly tightly constrained
system for the global analysis of PDFs. In total, there
are ~1300 data points which meet the minimum momentum
scale cuts which must be imposed to ensure
that PQCD applies. The fractional uncertainties on
these points are distributed roughly like $dF/F$ over
the range $F = 0.003 - 0.4$.

Parametrization: 

The non-perturbative parton distribution functions
$f_a(x,Q)$ at a low momentum scale $Q = Q_0$ are
parametrized by a set of functions of x, corresponding
to the various flavors a. For this analysis, $Q_0$ is taken to
be 1 GeV. The specific functional forms and the choice
of $Q_0$ are not important, as long as the parametrization
is general enough to accommodate the behavior of
the true (but unknown) non-perturbative PDFs. The
CTEQ analysis adopts the functional form

$$a_0\, x^{a_1} (1 - x)^{a_2} (1 + a_3 x^{a_4})$$

for most quark flavors as well as for the gluon.$^4$ After
momentum and quark number sum rules are enforced,
there are 18 free parameters left over, hereafter referred
to as "shape parameters" $\{a_i\}$. The PDFs at $Q > Q_0$
are determined from $f_a(x,Q_0)$ by evolution equations
from the renormalization group.

Table 1
List of processes and experiments used in the
CTEQ5M Global analysis. The total number of data
points is 1295.

Process     Experiment   Measurable                        $N_{\rm data}$
DIS         BCDMS [10]   $F_2^H$, $F_2^D$                  324
            NMC [11]     $F_2^H$, $F_2^D$                  240
            H1 [12]      $F_2^H$                           172
            ZEUS [13]    $F_2^H$                           186
            CCFR [14]    $F_2^{Fe}$, $xF_3^{Fe}$           174
Drell-Yan   E605 [15]    $s\,d\sigma/d\sqrt{\tau}\,dy$     119
            E866 [16]    $\sigma(pd)/2\sigma(pp)$          11
            NA-51 [17]   $A_{DY}$                          1
W-prod.     CDF [?]      Lepton asym.                      11
Incl. Jet   CDF [?]      $d\sigma/dE_t$                    33
            D0 [?]       $d\sigma/dE_t$                    24

Fitting: 

The values of $\{a_i\}$ are determined by fitting the
global experimental data to the theoretical expressions
which depend on these parameters. The fitting is done
by minimizing a global "chi-square" function, $\chi^2_{\rm global}$.
The quotation marks indicate that this function serves
as a figure of merit of the quality of the global fit; it
does not necessarily have the full significance associated
with rigorous statistical analysis, for reasons to
be discussed extensively throughout the rest of this
report. In practice, this function is defined as:

$$\chi^2_{\rm global} = \sum_n \sum_i w_n \left[ (N_n d_{ni} - t_{ni}) / \sigma^d_{ni} \right]^2 + \sum_n \left[ (1 - N_n) / \sigma^N_n \right]^2 \qquad (1)$$

where $d_{ni}$, $\sigma^d_{ni}$, and $t_{ni}$ denote the data, measurement
uncertainty, and theoretical value (dependent on $\{a_i\}$)
for the $i$th data point in the $n$th experiment. The
second term allows the absolute normalization ($N_n$)
for each experiment to vary, constrained by the published
normalization uncertainty ($\sigma^N_n$). The $w_n$ factors
are weights applied to some critical experiments with
very few data points, which are known (from physics
considerations) to provide useful constraints on certain
unique features of PDFs not afforded by other
experiments. Experience shows that without some judiciously
chosen weights, these experimental data points
will have no influence in the global fitting process. The
use of these weighting factors, to enable the relevant
unique constraints, amounts to imposing a certain prior
probability (based on physics knowledge) on the
statistical analysis.

4 An exception is that recent data from E866 seem to require
the ratio $\bar{d}/\bar{u}$ to take a more unconventional functional form.
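As an illustration of Eq. (1), the following minimal Python sketch evaluates the global figure of merit for a list of experiments. The `Experiment` container and its field names are hypothetical, introduced here only for illustration; they are not part of the CTEQ fitting code. The per-point uncertainty is assumed to be already combined (statistical and systematic in quadrature).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Experiment:
    """Hypothetical container for one data set entering Eq. (1)."""
    data: Sequence[float]     # d_ni
    sigma: Sequence[float]    # sigma^d_ni, combined point uncertainty
    theory: Sequence[float]   # t_ni, evaluated for the current {a_i}
    norm_sigma: float         # sigma^N_n, published normalization uncertainty
    norm: float = 1.0         # N_n, fitted normalization factor
    weight: float = 1.0       # w_n, weight given to this experiment


def chi2_global(experiments: List[Experiment]) -> float:
    """Sum of the weighted point terms and the normalization penalty terms."""
    total = 0.0
    for exp in experiments:
        for d, s, t in zip(exp.data, exp.sigma, exp.theory):
            total += exp.weight * ((exp.norm * d - t) / s) ** 2
        total += ((1.0 - exp.norm) / exp.norm_sigma) ** 2
    return total
```

Letting `norm` float trades a better match in the first term against the penalty in the second term; holding it at 1 removes the penalty entirely.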

In the above form, $\chi^2_{\rm global}$ includes for each data
point the random statistical uncertainties and the combined
systematic uncertainties in uncorrelated form, as
presented by most experiments in the published papers.
These two uncertainties are combined in quadrature
to form $\sigma^d_{ni}$ in Eq. (1). Detailed point-to-point
correlated systematic uncertainties are not available in the
literature in general; however, in some cases, they can
be obtained from the experimental groups. For global
fitting, uniformity in procedure with respect to all
experiments favors the usual practice of merging them
into the uncorrelated uncertainties. For the study of
PDF uncertainties, we shall discuss this issue in more
detail in Section 5.

Goodness-of-fit for CTEQ5M: 

Without going into details, Fig. 2 gives an overview
of how well CTEQ5M fits the total data set. The
graph is a histogram of the variable $x = (d - t)/\sigma$
where $d$ is a data value, $\sigma$ the uncertainty of that
measurement (statistical and systematic combined), and
$t$ the theoretical value for CTEQ5M. The curve in
Fig. 2 has no adjustable parameters; it is the Gaussian
with width 1 normalized to the total number of data
points (1295). Over the entire data set, the theory fits
the data within the assigned uncertainties, indicating
that those uncertainties are numerically consistent
with the actual measurement fluctuations. Similar
histograms for the individual experiments reveal various
deviations from the theory, but globally the data have
a reasonable Gaussian distribution around CTEQ5M.
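The check described above can be sketched in a few lines of Python. The function names are hypothetical, and the summary statistics (mean, r.m.s. width, fraction within one unit) are only a crude stand-in for comparing the histogram with a unit Gaussian, for which one would expect values near 0, 1 and 0.68 respectively.

```python
import math
from typing import List, Sequence, Tuple


def pulls(data: Sequence[float], sigma: Sequence[float],
          theory: Sequence[float]) -> List[float]:
    """x = (d - t) / sigma for every data point."""
    return [(d - t) / s for d, s, t in zip(data, sigma, theory)]


def pull_summary(x: Sequence[float]) -> Tuple[float, float, float]:
    """Mean, r.m.s. width about the mean, and fraction of points with |x| <= 1."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    within_one = sum(1 for v in x if abs(v) <= 1.0) / n
    return mean, rms, within_one
```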

3. Uncertainties on PDF parameters: The Error Matrix

We now describe results from an investigation of the
behavior of the $\chi^2_{\rm global}$ function at its minimum, using
the standard error matrix approach [?]. This allows
us to determine which combinations of parameters are
contributing the most to the uncertainty.

Figure 2. Histogram of (measurement - theory)/uncertainty
for all data points in the CTEQ5M fit.

At the minimum of $\chi^2_{\rm global}$, the first derivatives with
respect to the $\{a_i\}$ are zero; so near the minimum,
$\chi^2_{\rm global}$ can be approximated by

$$\chi^2_{\rm global} = \chi^2_0 + \frac{1}{2} \sum_{i,j} F_{ij}\, y_i y_j \qquad (2)$$

where $y_i = a_i - a_{0i}$ is the displacement from the
minimum, and $F_{ij}$ is the Hessian, the matrix of second
derivatives. It is natural to define a new set of coordinates
using the complete orthonormal set of eigenvectors
of the symmetric matrix $F_{ij}$ as basis vectors.
These vectors can be ordered by their eigenvalues $\epsilon_i$.
Each eigenvalue is a quantitative measure of the
uncertainties in the shape parameters $\{a_i\}$ for displacements
in parameter space in the direction of the corresponding
eigenvector. The quantity $\ell_i = 1/\sqrt{\epsilon_i}$ is the
distance in the 18 dimensional parameter space, in the
direction of eigenvector $i$, that makes a unit increase
in $\chi^2_{\rm global}$. If the only measurement uncertainties were
uncorrelated Gaussian uncertainties, then $\ell_i$ would be
one standard deviation from the best fit in the direction
of the eigenvector. The inverse of the Hessian is
the error matrix.
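The quantities introduced above can be computed as in the following pure-Python sketch. The Jacobi rotation routine is a generic textbook eigen-solver chosen here so that the example has no dependencies; it is an assumption of this illustration, not the procedure used in the analysis. The distances follow the definition $\ell_i = 1/\sqrt{\epsilon_i}$ given in the text, and the error matrix is built as the inverse of the Hessian from its eigen-decomposition.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def jacobi_eigen(F: Matrix, tol: float = 1e-24,
                 max_sweeps: int = 100) -> Tuple[List[float], Matrix]:
    """Eigenvalues and eigenvectors (columns of V) of a symmetric matrix."""
    n = len(F)
    A = [row[:] for row in F]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [A[i][i] for i in range(n)], V


def eigen_distances(eigenvalues: List[float]) -> List[float]:
    """l_i = 1 / sqrt(eps_i), as defined in the text."""
    return [1.0 / math.sqrt(e) for e in eigenvalues]


def error_matrix(eigenvalues: List[float], V: Matrix) -> Matrix:
    """Inverse of the Hessian: sum_k v_k v_k^T / eps_k."""
    n = len(eigenvalues)
    return [[sum(V[i][k] * V[j][k] / eigenvalues[k] for k in range(n))
             for j in range(n)] for i in range(n)]
```

A small eigenvalue gives a large $\ell_i$ and a large contribution to the error matrix, which is how flat directions show up numerically.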

Because the real uncertainties, for the wide variety
of experiments included, are far more complicated than
assumed in the ideal situation, the quantitative measure
of a given increase in $\chi^2_{\rm global}$ carries little true
statistical meaning. However, qualitatively, the Hessian
gives an analytic picture of $\chi^2_{\rm global}$ near its minimum in
$\{a_i\}$ space, and hence allows us to identify the particular
degrees of freedom that need further experimental
input in future global analyses.

From calculations of the Hessian we find that the
eigenvalues vary over a wide range. Figure 3 shows
a graph of the eigenvalues of $F_{ij}$, on a logarithmic
scale. The vertical axis is $\ell_i = 1/\sqrt{\epsilon_i}$, the distance
of a "standard deviation" along the $i$th eigenvector.
These distances range over 3 orders of magnitude. Large
eigenvalues of $F_{ij}$ correspond to "steep directions" of
$\chi^2_{\rm global}$. The corresponding eigenvectors are combinations
of shape parameters that are well determined by
current data. For example, parameters that govern
the valence u and d quarks at moderate x are sharply
constrained by DIS data. Small eigenvalues of $F_{ij}$
correspond to "flat directions" of $\chi^2_{\rm global}$. In the directions
of these eigenvectors, $\chi^2_{\rm global}$ changes little over large
distances in $\{a_i\}$ space. For example, parameters that
govern the large-x behavior of the gluon distribution,
or differences between sea quarks, properties of the
nucleon that are not accurately determined by current
data, contribute to the flat directions. The existence
of flat directions is inevitable in global fitting, because
as the data improve it only makes sense to maintain
enough flexibility for $f_a(x,Q_0)$ to fit the available
experimental constraints.

Figure 3. Plot of the eigenvalues of the Hessian. The
vertical axis is $\ell_i = 1/\sqrt{\epsilon_i}$.

Because the eigenvalues of the Hessian have a large
range of values, efficient calculation of $F_{ij}$ requires an
adaptive algorithm. In principle $F_{ij}$ is the matrix of
second derivatives at the minimum of $\chi^2_{\rm global}$, which
could be calculated from very small finite differences.
In practice, small computational errors in the evaluation
of $\chi^2_{\rm global}$ preclude the use of a very small step size.
Coarse-grained finite differences yield a more accurate
calculation of the second derivatives. But because the
variation of $\chi^2_{\rm global}$ varies markedly in different
directions, it is important to use a grid in $\{a_i\}$ space with
small steps in steep directions and large steps in flat
directions. This grid is generated by an iterative
procedure, in which $F_{ij}$ converges to a good estimate of
the second derivatives.
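The report does not spell out the iterative procedure, so the following Python sketch shows only one plausible way to realize the idea of direction-dependent step sizes. It estimates the second derivative along a given direction by a central difference and then resets the step so that, under the quadratic form of Eq. (2), the step would raise $\chi^2$ by a chosen `target` of order one. The function name, the `target` parameter and the stopping rule are all assumptions of this illustration.

```python
import math
from typing import Callable, Sequence, Tuple


def curvature_along(chi2: Callable[[Sequence[float]], float],
                    a0: Sequence[float],
                    direction: Sequence[float],
                    step: float = 1.0,
                    target: float = 1.0,
                    max_iter: int = 20,
                    rel_tol: float = 1e-6) -> Tuple[float, float]:
    """Second derivative of chi2 along `direction` and the adapted step size."""
    center = chi2(a0)
    curv = 0.0
    for _ in range(max_iter):
        plus = chi2([a + step * u for a, u in zip(a0, direction)])
        minus = chi2([a - step * u for a, u in zip(a0, direction)])
        curv = (plus - 2.0 * center + minus) / step ** 2
        if curv <= 0.0:
            break  # a0 is not a minimum along this direction; stop adapting
        # Step for which (1/2) * curv * step^2 equals the target increase.
        new_step = math.sqrt(2.0 * target / curv)
        converged = abs(new_step - step) <= rel_tol * step
        step = new_step
        if converged:
            break
    return curv, step
```

For a toy function such as `lambda a: 100 * a[0] ** 2 + 0.01 * a[1] ** 2`, the adapted step comes out small along the steep first axis and large along the flat second axis, which is the behavior the paragraph above calls for.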

From calculations of $F_{ij}$ we find that the minimum
[...] parameter space. Figures 4 and 5 show the behavior
of $\chi^2_{\rm global}$ near the minimum along each of the 18
eigenvectors. $\chi^2_{\rm global}$ is plotted on the vertical axis, and the
variable on the horizontal axis is the distance in $\{a_i\}$
space in the direction of the eigenvector, in units of
$\ell_i = 1/\sqrt{\epsilon_i}$. There is some nonlinearity, but it is small
enough that the Hessian can be used as an analytic
model of the functional dependence of $\chi^2_{\rm global}$ on the
shape parameters.

Figure 4. Value of $\chi^2$ along the six eigenvectors with
the largest eigenvalues.

Figure 5. Value of $\chi^2$ along the 12 eigenvectors with
the smallest eigenvalues.

In a future paper we will provide details on the
uncertainties of the original shape parameters $\{a_i\}$. But
it should be remembered that these parameters specify
the PDFs at the low Q scale, and applications of
PDFs to Tevatron experiments use PDFs at a high Q
scale. The evolution equations determine $f(x,Q)$ from
$f(x,Q_0)$, so the functional form at Q depends on the
$\{a_i\}$ in a complicated way.



4. Uncertainty on $\sigma_W$: the Lagrange Multiplier Method

In this Section, we determine the variation of $\chi^2_{\rm global}$
as a function of a single measurable quantity. We use
the production cross section for W bosons ($\sigma_W$) as
an archetypal example. The same method can be applied
to any other physical observable of interest, for
instance the Higgs production cross section, or to certain
measured differential distributions. The aim is to
quantify the uncertainty on that physical observable
due to uncertainties of the PDFs integrated over the
entire PDF parameter space.

Again, we use the standard CTEQ5 analysis tools
and results [?] as the starting point. The "best fit" is
the CTEQ5M1 set. A natural way to find the limits of
a physical quantity X, such as $\sigma_W$ at $\sqrt{s} = 1.8$ TeV, is
to take X as one of the search parameters in the global
fit and study the dependence of $\chi^2_{\rm global}$ for the 15 base
experimental data sets on X.

Conceptually, we can think of the function $\chi^2_{\rm global}$
that is minimized in the fit as a function of
$a_1, \ldots, a_{17}, X$ instead of $a_1, \ldots, a_{18}$. This idea could
be implemented directly in principle, but a more convenient
way to do the same thing in practice is through
Lagrange's method of undetermined multipliers. One
minimizes, with respect to the $\{a_i\}$, the quantity

$$F(\lambda) = \chi^2_{\rm global} + \lambda X(a_1, \ldots, a_{18}) \qquad (3)$$

for a fixed value of $\lambda$, the Lagrange multiplier. By
minimizing $F(\lambda)$ for many values of $\lambda$, we map out $\chi^2_{\rm global}$
as a function of X. The minimum of F for a given value
of $\lambda$ is the best fit to the data for the corresponding
value of X, i.e., X evaluated at the minimum.
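The scan over the Lagrange multiplier can be sketched with a toy model. Everything below is hypothetical: a two-parameter quadratic $\chi^2$ and a linear observable X stand in for the global fit and for $\sigma_W$, and the minimization of F is done in closed form, which is possible only because the toy model is quadratic. In a real analysis `minimize_F` would be a full refit of the shape parameters.

```python
from typing import List, Sequence, Tuple

# Toy model (all numbers hypothetical):
#   chi2(a) = sum_i K_i (a_i - A0_i)^2,   X(a) = sum_i G_i a_i
K = [100.0, 1.0]   # steep and flat directions
A0 = [0.5, -0.2]   # best-fit parameters
G = [1.0, 1.0]     # sensitivity of X to each parameter


def chi2(a: Sequence[float]) -> float:
    return sum(k * (x - x0) ** 2 for k, x, x0 in zip(K, a, A0))


def observable(a: Sequence[float]) -> float:
    return sum(g * x for g, x in zip(G, a))


def minimize_F(lam: float) -> List[float]:
    """Parameters minimizing F = chi2 + lam * X (closed form for the toy model)."""
    return [x0 - lam * g / (2.0 * k) for k, x0, g in zip(K, A0, G)]


def scan(lams: Sequence[float]) -> List[Tuple[float, float]]:
    """For each multiplier, record (X, chi2) at the constrained best fit."""
    points = []
    for lam in lams:
        a = minimize_F(lam)
        points.append((observable(a), chi2(a)))
    return points
```

Calling `scan([-2.0, 0.0, 2.0])` returns points that lie on a parabola in X centered on the unconstrained best fit; in this toy model, positive multipliers push X below its best-fit value.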

Figure 6 shows $\chi^2_{\rm global}$ for the 15 base experimental
data sets as a function of $\sigma_W$ at the Tevatron.
The horizontal axis is $\sigma_W$ times the branching ratio
for $W \to$ leptons, in nb. The CTEQ5M prediction is
$\sigma_W \cdot BR_{\rm lep} = 2.374$ nb. The vertical dashed lines are
±3% and ±5% deviations from the CTEQ5M prediction.

The two parabolas associated with points in Fig. 6
correspond to different treatments of the normalization
factor $N_n$ in Eq. (1). The dots (•) are variable norm fits,
in which $N_n$ is allowed to float, taking into account the
experimental normalization uncertainties, and $F(\lambda)$ is
minimized with respect to $N_n$. The justification for
this procedure is that overall normalization is a common
systematic uncertainty. The boxes (□) are fixed
norm fits, in which all $N_n$ are held fixed at their values
for the global minimum (CTEQ5M). These two procedures
represent extremes in the treatment of normalization
uncertainty. The parabolas associated with •'s
and □'s are just least-square fits to the points.

Figure 6. $\chi^2$ of the base experimental data sets versus
$\sigma_W \cdot BR_{\rm lep}$, the W production cross-section at the
Tevatron times lepton branching ratio, in nb.

The other curve in Fig. 6 was calculated using the
Hessian method. The Hessian $F_{ij}$ is the matrix of second
derivatives of $\chi^2_{\rm global}$ with respect to the shape
parameters $\{a_i\}$. The derivatives (first and second) of
$\sigma_W$ may also be calculated by finite differences. Using
the resultant quadratic approximations for $\chi^2_{\rm global}(a)$
and $\sigma_W(a)$, one may minimize $\chi^2_{\rm global}$ with $\sigma_W$ fixed.
Since this calculation keeps the normalization factors
constant, it should be compared with the fixed norm
fits from the Lagrange multiplier method. The fact
that the Hessian and Lagrange multiplier methods
yield similar results lends support to both approaches;
the small difference between them indicates that the
quadratic functional approximations for $\chi^2_{\rm global}$ and
$\sigma_W$ are only approximations.
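A worked version of this constrained minimization may help. It is a sketch under a simplifying assumption made here for illustration only: $\sigma_W$ is kept to first order in the displacements $y_i$, with $g_i = \partial\sigma_W/\partial a_i$ evaluated at the minimum, whereas the calculation described above also uses the second derivatives of $\sigma_W$. The symbols $g_i$ and $\sigma_W^{(0)}$ (the best-fit value) are introduced here and do not appear in the text.

```latex
\begin{aligned}
\chi^2_{\rm global} &\simeq \chi^2_0 + \tfrac{1}{2} \sum_{i,j} F_{ij}\, y_i y_j ,
\qquad
\sigma_W \simeq \sigma_W^{(0)} + \sum_i g_i\, y_i , \\
\frac{\partial}{\partial y_i}\left[ \chi^2_{\rm global} + \lambda\, \sigma_W \right] = 0
\;&\Rightarrow\;
y_i = -\lambda \sum_j (F^{-1})_{ij}\, g_j , \\
\Delta\sigma_W &= -\lambda\; g^{T} F^{-1} g ,
\qquad
\Delta\chi^2 = \tfrac{1}{2} \lambda^2\; g^{T} F^{-1} g
= \frac{(\Delta\sigma_W)^2}{2\; g^{T} F^{-1} g} .
\end{aligned}
```

Within this approximation $\chi^2_{\rm global}$ is an exact parabola in $\sigma_W$, whose width is set by the projection of the error matrix $F^{-1}$ onto the gradient of $\sigma_W$.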

For the quantitative analysis of uncertainties, the 
important question is: How large an increase in Xgiobai 
should be taken to define the likely range of uncertainty 
in XI There is an elementary statistical theorem that 
states that Ax 2 = f in a constrained fit corresponds 
to 1 standard deviation of the constrained quantity X. 
However, the theorem relies on the assumption that the uncertainties are Gaussian, uncorrelated, and correctly estimated in magnitude. Because these conditions do not hold for the full data set (of $\sim 1300$ points from 15 different experiments), this theorem cannot be naively applied quantitatively$^5$. Indeed, it can be shown that, if the measurement uncertainties are correlated, and the correlation is not properly taken into account in the definition of $\chi^2_{\rm global}$, then a standard deviation may vary over the entire range from $\Delta\chi^2 = 1$ to $\Delta\chi^2 = N$ (the total number of data points, $\sim 1300$ in our case).

5 It has been shown by Giele et al. [?] that, taken literally, only one or two selected experiments satisfy the standard statistical tests.

5. Statistical Analysis with Systematic Uncertainties

Fig. [?] shows how the fitting function $\chi^2_{\rm global}$ increases from its minimum value, at the best global fit, as the cross-section $\sigma_W$ for $W$ production is forced away from the prediction of the global fit. The next step in our analysis of PDF uncertainty is to use that information, or some other analysis, to estimate the uncertainty in $\sigma_W$. In ideal circumstances we could say that a certain increase of $\chi^2_{\rm global}$ from the minimum value, call it $\Delta\chi^2$, would correspond to a standard deviation of the global measurement uncertainty. Then a horizontal line on Fig. [?] at $\chi^2_{\rm min} + \Delta\chi^2$ would indicate the probable range of $\sigma_W$, by the intersection with the parabola of $\chi^2_{\rm global}$ versus $\sigma_W$.

However, such a simple estimate of the uncertainty of $\sigma_W$ is not possible, because the fitting function $\chi^2_{\rm global}$ does not include the correlations between systematic uncertainties. The uncertainty $\sigma_j$ in the definition ([?]) of $\chi^2_{\rm global}$ combines in quadrature the statistical and systematic uncertainties for each data point; that is, it treats the systematic uncertainties as uncorrelated. The standard theorems of statistics for Gaussian probability distributions of random uncertainties do not apply to $\chi^2_{\rm global}$.

Instead of using $\chi^2_{\rm global}$ to estimate confidence levels on $\sigma_W$, we believe the best approach is to carry out a thorough statistical analysis, including the correlations of systematic uncertainties, on individual experiments used in the global fit for which detailed information is available. We will describe here such an analysis for the measurements of $F_2(x,Q)$ by the H1 experiment [?] at HERA, as a case study. In a future paper, we will present similar calculations for other experiments.

The H1 experiment has provided a detailed table of measurement uncertainties, statistical and systematic, for their measurements of $F_2(x,Q)$ [?]. The CTEQ program uses 172 data points from H1 (requiring the cut $Q^2 > 5\,{\rm GeV}^2$). For each measurement $d_j$ (where $j = 1 \ldots 172$) there is a statistical uncertainty $\sigma_{0j}$, an uncorrelated systematic uncertainty $\sigma_{1j}$, and a set of 4 correlated systematic uncertainties $\sigma_{jk}$ where $k = 1 \ldots 4$. (In fact there are 8 correlated uncertainties listed in the H1 table. These correspond to 4 pairs. Each pair consists of one standard deviation in the
positive sense, and one standard deviation in the negative sense, of some experimental parameter. For this first analysis, we have approximated each pair of uncertainties by a single, symmetric combination, equal in magnitude to the average magnitude of the pair.)

Lagrange multiplier    $\sigma_W \cdot B$ in nb    $\chi^2/172$    probability
 3000                  2.294                       1.0847          0.212
 2000                  2.321                       1.0048          0.468
 1000                  2.356                       0.9676          0.605
    0                  2.374                       0.9805          0.558
-1000                  2.407                       1.0416          0.339
-2000                  2.431                       1.0949          0.187
-3000                  2.450                       1.1463          0.092

Table 2. Comparison of H1 data to the PDF fits with constrained values of $\sigma_W$.

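To judge the uncertainty of $\sigma_W$, as constrained by the H1 data, we will compare the H1 data to the global fits in Fig. [?]. The comparison is based on the true, statistical $\chi^2$, including the correlated uncertainties, which is given by

$$\chi^2 = \sum_j \frac{(d_j - t_j)^2}{\sigma_j^2} - \sum_{k,k'} B_k \left(A^{-1}\right)_{kk'} B_{k'} \qquad (4)$$

The index $j$ labels the data points and runs from 1 to 172. The indices $k$ and $k'$ label the source of systematic uncertainty and run from 1 to 4. The combined uncorrelated uncertainty $\sigma_j$ is given by $\sigma_j^2 = \sigma_{0j}^2 + \sigma_{1j}^2$. The second term in Eq. (4) comes from the correlated uncertainties. $B_k$ is the vector

$$B_k = \sum_j \frac{(d_j - t_j)\,\sigma_{jk}}{\sigma_j^2} \qquad (5)$$

and $A_{kk'}$ is the matrix

$$A_{kk'} = \delta_{kk'} + \sum_j \frac{\sigma_{jk}\,\sigma_{jk'}}{\sigma_j^2} \qquad (6)$$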



Assuming the published uncertainties $\sigma_{0j}$, $\sigma_{1j}$ and $\sigma_{jk}$ accurately reflect the measurement fluctuations, $\chi^2$ would obey a chi-square distribution if the measurements were repeated many times. Therefore the chi-square distribution with 172 degrees of freedom provides a basis for calculating confidence levels for the global fits in Fig. [?].
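The structure of the correlated $\chi^2$ can be checked numerically. The sketch below is not the CTEQ code: it assumes the form $\chi^2 = \sum_j r_j^2/\sigma_j^2 - B^T A^{-1} B$ with $r_j = d_j - t_j$, and compares it with the $\chi^2$ built from the full covariance matrix $V_{ij} = \sigma_i^2\delta_{ij} + \sum_k \sigma_{ik}\sigma_{jk}$. All function names and numbers are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy numbers, not H1 data.

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def chi2_correlated(d, t, sig0, sig1, sig_corr):
    """chi^2 with correlated systematics; sig_corr[j][k] is source k at point j."""
    n, n_sys = len(d), len(sig_corr[0])
    var = [sig0[j] ** 2 + sig1[j] ** 2 for j in range(n)]  # sigma_j^2
    resid = [d[j] - t[j] for j in range(n)]
    # Vector B_k
    b = [sum(resid[j] * sig_corr[j][k] / var[j] for j in range(n))
         for k in range(n_sys)]
    # Matrix A_kk'
    a = [[(1.0 if k == kp else 0.0)
          + sum(sig_corr[j][k] * sig_corr[j][kp] / var[j] for j in range(n))
          for kp in range(n_sys)] for k in range(n_sys)]
    a_inv_b = solve(a, b)
    uncorrelated_part = sum(resid[j] ** 2 / var[j] for j in range(n))
    return uncorrelated_part - sum(b[k] * a_inv_b[k] for k in range(n_sys))


def chi2_full_covariance(d, t, sig0, sig1, sig_corr):
    """Same quantity from the full covariance matrix, as a cross-check."""
    n, n_sys = len(d), len(sig_corr[0])
    cov = [[sum(sig_corr[i][k] * sig_corr[j][k] for k in range(n_sys))
            + ((sig0[i] ** 2 + sig1[i] ** 2) if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    resid = [d[j] - t[j] for j in range(n)]
    v_inv_r = solve(cov, resid)
    return sum(resid[j] * v_inv_r[j] for j in range(n))


# Toy inputs: 4 data points, 2 correlated systematic sources.
data = [1.02, 0.95, 1.10, 0.99]
theory = [1.00, 1.00, 1.05, 1.02]
stat = [0.03, 0.04, 0.05, 0.03]
uncorr_sys = [0.01, 0.02, 0.02, 0.01]
corr_sys = [[0.02, 0.01], [0.02, -0.01], [0.03, 0.01], [0.02, 0.00]]

chi2_a = chi2_correlated(data, theory, stat, uncorr_sys, corr_sys)
chi2_b = chi2_full_covariance(data, theory, stat, uncorr_sys, corr_sys)
print(chi2_a, chi2_b)
```

The first form only requires inverting a small matrix whose size is the number of systematic sources (4 for H1), rather than one whose size is the number of data points (172).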

Table [?] shows $\chi^2$ for the H1 data compared to seven of the PDF fits in Fig. [?]. The center row of the Table is the global best fit, CTEQ5m. The other rows are fits obtained by the Lagrange multiplier method for different values of the Lagrange multiplier. The best fit to the H1 data, i.e., the smallest $\chi^2$, is not CTEQ5m (the best global fit) but rather the fit with Lagrange multiplier 1000 for which $\sigma_W$ is 0.8% smaller than the prediction of CTEQ5m. Forcing the $W$ cross section values away from the prediction of CTEQ5m causes an increase in $\chi^2$ for the DIS data. At $\sqrt{s} = 1.8$ TeV, $W$ production is mainly from $q\bar{q} \to W^{\pm}$ with moderate values of $x$ for $q$ and $\bar{q}$, i.e., values in the range of DIS experiments. Forcing $\sigma_W$ higher (or lower) requires a higher (or lower) valence quark density in the proton, in conflict with the DIS data, so $\chi^2$ increases.

The final column in Table [?], labeled "probability", is computed from the chi-square distribution with 172 degrees of freedom. This quantity is the probability for $\chi^2$ to be greater than the value calculated from the existing data, if the H1 measurements were to be repeated. So, for example, the fit with Lagrange multiplier $-3000$, which corresponds to $\sigma_W$ being 3.2% larger than the CTEQ5m prediction, has probability 0.092. In other words, if the H1 measurements could be repeated many times, in only 9.2% of trials would $\chi^2$ be greater than or equal to the value that has been obtained with the existing data. This probability represents a confidence level for the value of $\sigma_W$ that was forced on the PDF by setting the Lagrange multiplier equal to $-3000$. At the 9.2% confidence level we can say that $\sigma_W \cdot BR_{\ell\nu}$ is less than 2.450 nb, based on the H1 data. Similarly, at the 21.2% confidence level we can say that $\sigma_W \cdot BR_{\ell\nu}$ is greater than 2.294 nb.
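The "probability" column can be recomputed from the $\chi^2/172$ column. The sketch below is an illustration, not the authors' code. It uses the fact that, for an even number of degrees of freedom $2m$, the chi-square tail probability is the Poisson sum $e^{-x}\sum_{k=0}^{m-1} x^k/k!$ with $x = \chi^2/2$. The function name is hypothetical and the input values are copied from Table 2.

```python
import math


def chi2_tail_even_dof(chi2_value, dof):
    """P(chi^2 > chi2_value) for a positive, even number of degrees of freedom.

    For dof = 2m the tail probability equals
        exp(-x) * sum_{k=0}^{m-1} x**k / k!   with x = chi2_value / 2.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive even dof")
    x = chi2_value / 2.0
    term = math.exp(-x)  # k = 0 term
    total = term
    for k in range(1, dof // 2):
        term *= x / k    # builds x**k / k! * exp(-x) iteratively
        total += term
    return total


N_POINTS = 172
# chi^2 / 172 for each Lagrange multiplier, copied from Table 2.
chi2_per_point = {
    3000: 1.0847, 2000: 1.0048, 1000: 0.9676, 0: 0.9805,
    -1000: 1.0416, -2000: 1.0949, -3000: 1.1463,
}
probabilities = {
    multiplier: chi2_tail_even_dof(ratio * N_POINTS, N_POINTS)
    for multiplier, ratio in chi2_per_point.items()
}
for multiplier in sorted(probabilities, reverse=True):
    print(f"{multiplier:6d}  {probabilities[multiplier]:.3f}")
```

The printed values can be compared directly with the last column of the table.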




Figure 7. $\chi^2/N$ of the H1 data, including error correlations, compared to PDFs obtained by the Lagrange multiplier method for constrained values of $\sigma_W$.



Fig. [?] is a graph of $\chi^2/N$ for the H1 data compared to the PDF fits in Table [?]. This figure may be compared to Fig. [?]. The CTEQ5 prediction of the $W$ production cross-section is shown as an arrow, and the vertical dashed lines are $\pm 3\%$ away from the CTEQ5m prediction. The horizontal dashed line is the 68% confidence level on $\chi^2/N$ for $N = 172$ degrees of freedom. The comparison with H1 data alone indicates that the uncertainty on $\sigma_W$ is $\sim 3\%$.

There is much more to say about $\chi^2$ and confidence levels. In a future paper we will discuss statistical calculations for other experiments in the global data set. The H1 experiment is a good case, because for H1 we have detailed information about the correlated uncertainties. But it may be somewhat fortuitous that the $\chi^2$ per data point for CTEQ5m is so close to 1 for the H1 data set. In cases where $\chi^2/N$ is not close to 1, which can easily happen if the estimated systematic uncertainties are not textbook-like, we must supply further arguments about confidence levels. For experiments with many data points, like 172 for H1, the chi-square distribution is very narrow, so a small inaccuracy in the estimate of $\sigma_j$ may translate to a large uncertainty in the calculation of confidence levels based on the absolute value of $\chi^2$. Because the estimation of experimental uncertainties introduces some uncertainty in the value of $\chi^2$, it is not really the absolute value of $\chi^2$ that is important, but rather the relative value compared to the value at the global minimum. Therefore, we might study ratios of $\chi^2$'s to interpret the variation of $\chi^2$ with $\sigma_W$.
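A rough worked estimate shows how narrow the distribution is. This is our own arithmetic, using only the standard mean and variance of the chi-square distribution. It assumes, as a simplification, that all published uncertainties are rescaled by one common factor $1+\epsilon$.

```latex
\begin{align*}
\mathrm{E}\left[\chi^2/N\right] &= 1, &
\mathrm{sd}\left[\chi^2/N\right] &= \sqrt{2/N} = \sqrt{2/172} \approx 0.108, \\
\sigma \to (1+\epsilon)\,\sigma
  &\;\Rightarrow\; \chi^2 \to \frac{\chi^2}{(1+\epsilon)^2}, &
\epsilon = 0.05 &\;\Rightarrow\; \frac{1}{(1.05)^2} \approx 0.907 .
\end{align*}
```

Under this assumption, a 5% overestimate of the uncertainties lowers $\chi^2/N$ by about 9%, which is close to one standard deviation of $\chi^2/N$ for 172 points.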



6. Conclusions

It has been widely recognized by the HEP community, and it has been emphasized at this workshop, that PDF phenomenology must progress from the past practice of periodic updating of representative PDF sets to a systematic effort to map out the uncertainties, both on the PDFs themselves and on physical observables derived from them. For the analysis of PDF uncertainties, we have only addressed the issues related to the treatment of experimental uncertainties. Equally important for the ultimate goal, one must come to grips with uncertainties associated with theoretical approximations and phenomenological parametrizations. Both of these sources of uncertainties induce highly correlated uncertainties, and they can be numerically more important than experimental uncertainties in some cases. Only a balanced approach is likely to produce truly useful results. Thus, a great deal of work lies ahead.

This report described first results from two methods for quantifying the uncertainty of parton distribution functions associated with experimental uncertainties. The specific work is carried out as extensions of the CTEQ5 global analysis. The same methods can be applied using other parton distributions as the starting point, or using a different parametrization of the non-perturbative PDFs. We have indeed tried a variety of such alternatives. The results are all similar to those presented above. The robustness of these results lends confidence to the general conclusions.

The Hessian, or error matrix, method reveals the uncertainties of the shape parameters used in the functional parametrization. The behavior of $\chi^2_{\rm global}$ in the neighborhood of the minimum is well described by the Hessian if the minimum is quadratic.

The Lagrange multiplier method produces constrained fits, i.e., the best fits to the global data set for specified values of some observable. The increase of $\chi^2_{\rm global}$, as the observable is forced away from the predicted value, indicates how well the current data on PDFs determine the observable.

The constrained fits generated by the Lagrange multiplier method may be compared to data from individual experiments, taking into account the uncertainties in the data, to estimate confidence levels for the constrained variable. For example, we estimate that the uncertainty of $\sigma_W$ attributable to PDFs is $\pm 3\%$.

Further work is needed to apply these methods to other measurements, such as the $W$ mass or the forward-backward asymmetry of $W$ production in $p\bar{p}$ collisions. Such work will be important in the era of high precision experiments.
Abstract

We review the status of our effort to extract parton 
distribution functions from data with a quantitative 
estimate of the uncertainties. 

1. Introduction 

The goal of our work is to extract parton distribution functions (PDF) from data with a quantitative estimation of the uncertainties. There are some qualitative tools that exist to estimate the uncertainties, see e.g. Ref. [?]. These tools are clearly not adequate when the PDF uncertainties become important. One crucial example of a measurement that will need a quantitative assessment of the PDF uncertainty is the planned high precision measurement of the mass of the $W$ vector boson at the Tevatron. Clearly, quantitative tools along the line of S. Alekhin's pioneering work [?] are needed.

The method we have developed in Ref. [?] is flexible and can accommodate non-Gaussian distributions for the uncertainties associated with the data and the fitted parameters as well as all their correlations. New data can be added in the fit without having to redo the whole fit. Experimenters can therefore include their own data into the fit during the analysis phase, as long as correlation with older data can be neglected. Within this method it is trivial to propagate the PDF uncertainties to new observables; there is for example no need to calculate the derivative of the observable with respect to the different PDF parameters. The method also provides tools to assess the goodness of the fit and the compatibility of new data with the current fit. The computer code has to be fast as there is a large number of choices in the inputs that need to be tested.

It is clear that some of the uncertainties are difficult to quantify and it might not be possible to quantify all of them. All the plots presented here are for illustration of the method only; our results are preliminary. At the moment we are not including all the sources of uncertainties and our results should therefore be considered as lower limits on the PDF uncertainties. Note that all the techniques we use can be found in books and papers on statistics [?] and/or in Numerical Recipes [?].



2. Outline of the Method 

We only give a brief overview of the method in this section. More details are available in Ref. [?]. Our method follows the Bayesian methodology$^6$. Once a set of core experiments is selected, a large number of uniformly distributed sets of parameters $\lambda \equiv \lambda_1, \lambda_2, \ldots, \lambda_{N_{par}}$ (each set corresponds to one PDF) can be generated and the probability density of the set, $P(\lambda)$, calculated from the likelihood (the probability) that the predictions based on $\lambda$ describe the data, see Ref. [?] and next section.

Knowing $P(\lambda)$, then for any observable $x$ (or any quantity that depends on $\lambda$) the probability density, $P(x)$, can be evaluated, and using a Monte Carlo integration, the average value and the standard deviation of $x$ can be calculated with the standard expressions:

$$\mu_x = \int \Big(\prod_{i=1}^{N_{par}} d\lambda_i\Big)\, x(\lambda)\, P(\lambda), \qquad \sigma_x^2 = \int \Big(\prod_{i=1}^{N_{par}} d\lambda_i\Big)\, \big(x(\lambda) - \mu_x\big)^2 P(\lambda). \qquad (7)$$



If $P(x)$ is Gaussian distributed, then the standard deviation is a sufficient measure of the PDF uncertainties. If $P(x)$ is not Gaussian distributed, then one should refer to the distribution itself and not try to "summarize" it by a single number; all the information is in the distribution itself. The uncertainties due to the Monte Carlo integration can also be calculated with standard techniques.

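The above is correct but computationally inefficient; instead we use a Metropolis algorithm, see Ref. [?], to generate $N_{pdf}$ unit-weighted PDFs distributed according to $P(\lambda)$. With this set of PDFs, the expressions in Eq. [?] become:

$$\mu_x \approx \frac{1}{N_{pdf}} \sum_{j=1}^{N_{pdf}} x(\lambda_j), \qquad \sigma_x^2 \approx \frac{1}{N_{pdf}} \sum_{j=1}^{N_{pdf}} \big(x(\lambda_j) - \mu_x\big)^2. \qquad (8)$$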



6 We also plan to present results within the "classical frequentist" framework [?].

This is equivalent to importance sampling in Monte Carlo integration techniques. It is very efficient because the number of PDFs needed to reach a given level of accuracy in the evaluation of the integrals is much smaller than when using a set of PDFs uniformly distributed. Given the unit-weighted set of PDFs, a new experiment can be added to the fit by assigning a weight (a new probability) to each of the PDFs, using Bayes' theorem. The above summations become weighted. There is no need to redo the whole fit if there is no correlation between the old and new data. If we know how to calculate $P(\lambda)$ properly, the only uncertainty in the method comes from the Monte Carlo integrations.
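The sampling and reweighting steps can be illustrated with a one-parameter toy model. This is a minimal sketch under stated assumptions, not the authors' code. The "core" data are assumed to give a Gaussian $P(\lambda)$ centred at 1.0 with width 0.2, and the new, uncorrelated measurement of $\lambda$ is assumed to be $1.2 \pm 0.2$. All names and numbers are hypothetical.

```python
import math
import random


def log_p_core(lam):
    """Toy ln P(lambda) from the core data: Gaussian, mean 1.0, width 0.2."""
    return -0.5 * ((lam - 1.0) / 0.2) ** 2


def metropolis(log_p, start, step, n_samples, rng):
    """Return n_samples unit-weighted values distributed according to exp(log_p)."""
    samples = []
    current, current_lp = start, log_p(start)
    for _ in range(n_samples):
        proposal = current + rng.uniform(-step, step)
        proposal_lp = log_p(proposal)
        # Accept with probability min(1, P(proposal) / P(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def weighted_mean_std(values, weights):
    """Weighted average and standard deviation (unit weights give the plain sums)."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


rng = random.Random(12345)
lambdas = metropolis(log_p_core, start=1.0, step=0.5, n_samples=40000, rng=rng)
unit_weights = [1.0] * len(lambdas)

# Unit-weighted sums, for the parameter and for an observable x(lambda).
mean_lam, std_lam = weighted_mean_std(lambdas, unit_weights)
mean_obs, std_obs = weighted_mean_std([lam ** 2 for lam in lambdas], unit_weights)

# Add the new measurement: each sample gets a weight equal to its likelihood.
new_weights = [math.exp(-0.5 * ((lam - 1.2) / 0.2) ** 2) for lam in lambdas]
mean_new, std_new = weighted_mean_std(lambdas, new_weights)

print(mean_lam, std_lam)   # analytic expectation: 1.0 and 0.2
print(mean_new, std_new)   # analytic expectation: 1.1 and 0.2 / sqrt(2)
```

The observable $x(\lambda) = \lambda^2$ is propagated simply by evaluating it on each sample, with no derivative with respect to the parameters. The new measurement is included without repeating the sampling.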

3. Calculation of P(X) 

Given a set of experimental points $\{x^e\} = x^e_1, \ldots, x^e_{N_{obs}}$, the probability of a set of PDFs is in fact the conditional probability of $\{\lambda\}$ given that $\{x^e\}$ has been measured. This conditional probability can be calculated using Bayes' theorem:

$$P(\lambda) = P(\lambda|x^e) = \frac{P(x^e|\lambda)}{P(x^e)}\, P_{init}(\lambda), \qquad (9)$$

where, as already mentioned, the prior distribution of the parameters, $P_{init}(\lambda)$, has been assumed to be uniform. A prior sensitivity study should be performed. $P(x^e|\lambda)$ is the likelihood, the probability to observe the data given that the theory is fixed by the set of $\{\lambda\}$. $P(x^e)$ is the probability density of the data (integrated over the PDFs) and acts as a normalization coefficient in Eq. [?].

If all the uncertainties are Gaussian distributed, then it is well known that:

$$P(x^e|\lambda) \propto e^{-\chi^2(\lambda)/2}, \qquad (10)$$

where $\chi^2$ is the usual chi-square:

$$\chi^2(\lambda) = \sum_{k,l} \big(x^e_k - x^t_k(\lambda)\big)\, M^{tot}_{kl}\, \big(x^e_l - x^t_l(\lambda)\big). \qquad (11)$$

$x^t_k(\lambda)$ are the theory predictions for the experimental observables calculated with the parameters $\{\lambda\}$. The matrix $M^{tot}$ is the inverse of the total covariance matrix.

When the uncertainties are not Gaussian distributed, the result is not as well known. We first present two simple examples to illustrate how the likelihood should be calculated and then give a generalization.

3.1. The simplest example 

We first consider the simplest example to set up the notation, one experimental point with a statistical uncertainty:

$$x^t(\lambda) = x^e + u\Delta, \qquad (12)$$

where $u$ is a random variable that has its own distribution, $f(u)$ (assumed to be Gaussian in this case). By convention, we take the average of $u$ equal to 0 and its standard deviation equal to 1. $\Delta$ gives the size of the statistical uncertainty. For each experimental measurement there is a different value of $u$ and $x^e$. The probability to find $x^e$ in an element of length $dx^e$ given that the theory is fixed by $\{\lambda\}$ is equal to the probability to find $u$ in a corresponding element of length $du$$^7$:

$$P(x^e|\lambda)\,dx^e = f(u)\,du. \qquad (13)$$

The variable $u$ and the Jacobian for the change of variable from $u$ to $x^e$ can be extracted from Eq. 12:

$$u = \frac{x^t(\lambda) - x^e}{\Delta}, \qquad \left|\frac{du}{dx^e}\right| = \frac{1}{\Delta}, \qquad (14)$$

such that:

$$P(x^e|\lambda) = \frac{1}{\Delta}\, f\!\left(\frac{x^t(\lambda) - x^e}{\Delta}\right) = \frac{1}{\sqrt{2\pi}\,\Delta}\, e^{-\frac{(x^t - x^e)^2}{2\Delta^2}}. \qquad (15)$$

This is the expected result.



3.2. A simple example 

We now consider the case of one experimental point with a statistical and a systematic uncertainty:

$$x^t(\lambda) = x^e + u_1\Delta_1 + u_2\Delta_2. \qquad (16)$$

$\Delta_1$ and $\Delta_2$ give the size of the uncertainties. $u_1$ and $u_2$ have their own distributions $f_1(u_1)$ and $f_2(u_2)$ and we use the same convention for their average and standard deviation as for $u$ in the first example. This time for each experimental measurement, there is an infinite number of sets of $u_1, u_2$ that correspond to it, because there is only one equation that relates $x^t$, $x^e$, $u_1$ and $u_2$. The probability to find $x^e$ in an element of length $dx^e$ given that the theory is fixed by $\{\lambda\}$ is here equal to the probability to find $u_1$ and $u_2$ in a corresponding element of area $du_1\,du_2$, with an integration over one of the two variables:

$$P(x^e|\lambda)\,dx^e = du_1 \int du_2\, f_1(u_1)\, f_2(u_2). \qquad (17)$$

7 The repetition of the experiment will only be distributed according to $u$ around the true nature value of $x^t$. However we are trying to calculate the likelihood, the conditional probability of the data given that the true nature value of $x^t$ is given by the value of the $\{\lambda\}$ under study.

We choose to integrate over $u_2$. $u_1$ and the Jacobian for the change of variable from $u_1$ to $x^e$ are given by Eq. [?]:

$$u_1 = \frac{x^t - x^e - u_2\Delta_2}{\Delta_1}, \qquad \left|\frac{du_1}{dx^e}\right| = \frac{1}{\Delta_1}, \qquad (18)$$

such that:

$$P(x^e|\lambda) = \int du_2\, f_2(u_2)\, f_1\!\left(\frac{x^t - x^e - u_2\Delta_2}{\Delta_1}\right) \frac{1}{\Delta_1}. \qquad (19)$$

If both $f_1(u_1)$ and $f_2(u_2)$ are Gaussian distributions then we recover the expected result, as in Eq. [?]. Note that this expected result is recovered if the uncertainties are Gaussian distributed and the relationship between the theory, the data and the uncertainties is given by Eq. [?]. If that relationship is more complex there is no guarantee to recover Eq. [?]. In the general case, the integral in Eq. [?] has to be done numerically.
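The one-dimensional integral over $u_2$ is easy to evaluate numerically. The sketch below is an illustration with hypothetical names and toy numbers, not the authors' implementation. It integrates over $u_2$ with the trapezoid rule and checks that, for two Gaussian distributions, the result agrees with a single Gaussian whose width is $\sqrt{\Delta_1^2 + \Delta_2^2}$. It then repeats the calculation with a flat systematic distribution of the same mean and standard deviation, to show that the likelihood changes.

```python
import math


def gauss(u):
    """Unit Gaussian density (mean 0, standard deviation 1)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def flat(u):
    """Flat density with mean 0 and standard deviation 1 (half-width sqrt(3))."""
    half_width = math.sqrt(3.0)
    return 1.0 / (2.0 * half_width) if abs(u) <= half_width else 0.0


def likelihood(x_e, x_t, delta1, delta2, f1=gauss, f2=gauss,
               u_max=10.0, n_steps=4000):
    """Trapezoid-rule integral over u2 of f2(u2) * f1(u1(u2)) / delta1."""
    h = 2.0 * u_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        u2 = -u_max + i * h
        edge_weight = 0.5 if i in (0, n_steps) else 1.0
        u1 = (x_t - x_e - u2 * delta2) / delta1
        total += edge_weight * f2(u2) * f1(u1) / delta1
    return total * h


def gaussian_closed_form(x_e, x_t, delta1, delta2):
    """Expected result when f1 and f2 are Gaussian: widths add in quadrature."""
    var = delta1 ** 2 + delta2 ** 2
    return math.exp(-0.5 * (x_t - x_e) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


numeric = likelihood(x_e=1.05, x_t=1.00, delta1=0.03, delta2=0.04)
analytic = gaussian_closed_form(1.05, 1.00, 0.03, 0.04)
flat_systematic = likelihood(1.05, 1.00, 0.03, 0.04, f2=flat)

print(numeric, analytic)   # these two should agree closely
print(flat_systematic)     # a non-Gaussian systematic gives a different value
```

The same routine accepts any pair of densities for `f1` and `f2`, which is the point of the general formulation: the Gaussian closed form is only one special case.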

3.3. Generalization: 

We are now ready to give a generalization of the calculation of the likelihood. We are considering $N_{obs}$ observables, and $N_{unc}$ uncertainties (statistical and systematic) parametrised by $N_{unc}$ random variables $\{u\} = u_1, u_2, \ldots, u_{N_{unc}}$ with their own distributions, $f_i(u_i)$$^8$. There are $N_{obs}$ relations between $\{x^t\}$, $\{x^e\}$ and $\{u\}$, one for each observable:

$$F_i\big(x^e_i, \{x^t(\lambda)\}, \{u\}\big) = 0. \qquad (20)$$

This gives $N_{unc} - N_{obs}$ independent $u$'s that we choose by convenience to be the $u_i$'s corresponding to the systematic uncertainties. Without losing generality we assume that there is one statistical uncertainty for each observable, and we organize the corresponding $u_i$ with the same index as $x^e_i$, such that the last $N_{sys}$ ($= N_{unc} - N_{obs}$) $u_i$ are the random variables for the systematic uncertainties. For each set of measured $\{x^e\}$ there is an infinite number of $\{u\}$ sets that correspond to it.

The probability to find $\{x^e\}$ in an element of volume $\prod_{i=1}^{N_{obs}} dx^e_i$ given that the theory is fixed by $\{\lambda\}$ is equal to the probability to find the $\{u\}$ in a corresponding element of volume $\prod_{j=1}^{N_{unc}} du_j$, with an integration over the independent $u$'s:

$$P(\{x^e\}|\lambda) \prod_{i=1}^{N_{obs}} dx^e_i = \Big(\prod_{k=1}^{N_{obs}} du_k\Big) \int \Big(\prod_{l=N_{obs}+1}^{N_{unc}} du_l\Big) \prod_{j=1}^{N_{unc}} f_j(u_j). \qquad (21)$$

8 If there are correlations between the $u_i$, replace $\prod_{j=1}^{N_{unc}} f_j(u_j)$ by $f(u_1, u_2, \ldots, u_{N_{unc}})$, the global probability distribution of the
The values of the $\{u_i, i = 1, N_{obs}\}$ (corresponding to the statistical uncertainties) and the Jacobian, $J(u \to x^e)$, for the change of variable from those $u_i$ to the $x^e_i$ can be extracted from the $N_{obs}$ relations in Eq. [?]. The likelihood is then given by:

$$P(\{x^e\}|\lambda) = \int \Big(\prod_{l=N_{obs}+1}^{N_{unc}} du_l\Big)\, J(u \to x^e) \prod_{j=1}^{N_{unc}} f_j(u_j). \qquad (22)$$

Often, the $F_i$ relationships in Eq. 20 have a simple dependence on $\{x^e\}$ and the $u$'s corresponding to the statistical uncertainties:

$$F_i\big(x^e_i, \{x^t(\lambda)\}, \{u\}\big) = x^e_i + u_i\Delta_i + \cdots, \qquad (23)$$

where the $\Delta_i$ are the sizes of the statistical uncertainties. In that case, the Jacobian is simply given by:

$$J(u \to x^e) = \prod_{i=1}^{N_{obs}} \frac{1}{\Delta_i}. \qquad (24)$$

In most cases, the likelihood will not be analytically calculable, and has to be calculated numerically, again with Monte Carlo techniques.

In order to be able to calculate the likelihood we therefore need:

• the relations between $\{x^t\}$, $\{x^e\}$ and $\{u\}$ as in Eq. [?];

• the probability distribution of the random variables associated with the uncertainties: $f_i(u_i)$.

Unfortunately most of the time that information is not reported by the experimenters, and/or is not available and certainly difficult to extract from papers. It is only in the case that all the uncertainties are Gaussian distributed$^9$ that it is sufficient to report the size of the uncertainties and their correlation$^{10}$. This is a very important issue: simply put, experiments should always provide a way to calculate the likelihood, $P(\{x^e\}|\lambda)$. This last fact was also the unanimous conclusion of a recent workshop on confidence limits held at CERN [?]. This is particularly crucial when combining different experiments together: the pull of each experiment will depend on it and, as a result, so will the central values of the deduced PDFs.

3.4. The central limit theorem 

Assuming that the uncertainties are Gaussian distributed when they are not can lead to some serious problems. For example, minimizing the $\chi^2$ constructed assuming Gaussian distributions will not even maximize the likelihood. Indeed in the general case, the usually defined $\chi^2$ will not appear in the likelihood.

9 Or can be considered as Gaussian distributed, see later.

10 With an explicit statement that the uncertainties can be assumed to be Gaussian distributed.

It is often assumed that the central limit theorem can be used to justify the assumption of Gaussian distribution for the uncertainties. It is therefore useful to revisit this theorem. $Y$ is a linear combination of $n$ independent $X_i$:

$$Y = \sum_i c_i X_i, \qquad (25)$$

$$\sigma_Y^2 = \sum_i c_i^2\, \sigma_{X_i}^2,$$

where the $c_i$ are constants and the $\sigma$ are the standard deviations. The theorem states that in the limit of large $n$ the distribution of $Y$ will be approximately Gaussian if $\sigma_Y^2$ is much larger than any component $c_i^2 \sigma_{X_i}^2$ from a non-Gaussian distributed $X_i$. For some examples of how large $n$ has to be, see Ref. [?].

Here is one way the theorem could be used: If the $F_i$ relations are given by:

$$x^t_i(\lambda) = x^e_i + \sum_{k=1}^{N_{unc}} u_k \Delta_{ik}$$

and if there is a large number of uncertainties, the $u_k$ are independent and none of the $\Delta_{ik}$ for a non-Gaussian-like $u_k$ dominates, then we know that the sum will be approximately Gaussian distributed. One way to express this fact is simply to assume that all the uncertainties are Gaussian distributed. In this case, we recover the usual expression for the likelihood.

A direct consequence is that if there are a few uncertainties that dominate a measurement, then we certainly need to know their distribution. See Ref. [?] for an example of a non-Gaussian dominant uncertainty in a real life experiment.
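The condition in the theorem can be illustrated with a small simulation. This is a hedged sketch with hypothetical names, not taken from the paper. Each $X_i$ is drawn from a flat (non-Gaussian) distribution on $[-1, 1]$, and the excess kurtosis of $Y$ is used as a simple measure of departure from a Gaussian, for which it is 0.

```python
import random


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; equals 0 for a Gaussian."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def sample_sum(coefficients, n_samples, rng):
    """Draw Y = sum_i c_i * X_i with each X_i flat on [-1, 1]."""
    return [sum(c * rng.uniform(-1.0, 1.0) for c in coefficients)
            for _ in range(n_samples)]


rng = random.Random(2024)

# Twelve components of equal size: no single term dominates.
balanced = sample_sum([1.0] * 12, 50000, rng)

# Same number of components, but one is ten times larger than the others.
dominated = sample_sum([10.0] + [1.0] * 11, 50000, rng)

k_balanced = excess_kurtosis(balanced)
k_dominated = excess_kurtosis(dominated)
print(k_balanced)    # close to 0: Y is nearly Gaussian
print(k_dominated)   # clearly negative: Y keeps the flat shape of the dominant term
```

With twelve comparable terms the sum is already close to Gaussian. With one dominant flat term, the sum stays visibly non-Gaussian however many small terms are added.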

3.5. Luminosity Uncertainty 

We now turn to the calculation of the likelihood when there is a normalization uncertainty, like the luminosity uncertainty. The $F$ relation of Eq. [?] is given by:

$$\mathcal{L}\lambda = x^e + u_1\Delta_1, \qquad (26)$$

where we have assumed that we are measuring the parameter directly, $x^t = \lambda$. The luminosity, $\mathcal{L}$, also has an uncertainty:

$$\mathcal{L} = \mathcal{L}_0 + u_2\Delta_2. \qquad (27)$$

We assume that both $u_1$ and $u_2$ are Gaussian distributed. Replacing Eq. [?] in Eq. [?] we obtain:

$$\mathcal{L}_0\lambda - x^e = u_1\Delta_1 - u_2\Delta_2\, x^t. \qquad (28)$$
This expression shows that $\mathcal{L}_0\lambda - x^e$ is the sum of two Gaussians, such that the likelihood is a Gaussian distribution with the standard deviation given by:

$$\sigma^2 = \Delta_1^2 + (\Delta_2 x^t)^2. \qquad (29)$$

The systematic uncertainty due to the luminosity uncertainty is proportional to the theory. Explicitly:

$$P(x^e|\lambda) = \frac{1}{\sqrt{2\pi}\,\sqrt{\Delta_1^2 + (\Delta_2\lambda)^2}} \exp\!\left(-\frac{(x^e - \mathcal{L}_0\lambda)^2}{2\big(\Delta_1^2 + (\Delta_2\lambda)^2\big)}\right). \qquad (30)$$



This result can also be derived from the general expression of the likelihood, after doing the appropriate integral analytically.

A few remarks are in order. In this case, even though all the uncertainties are Gaussian distributed, the minimization of the $\chi^2$ would not maximize the likelihood because the theory appears in the normalization of the likelihood. Another mistake that leads to problems in this case is to replace $\lambda$ by $x^e/\mathcal{L}_0$ in the uncertainty. This mistake leads to a downward bias. If $x^e$ has a downward statistical fluctuation, a smaller systematic uncertainty is assigned to it, such that when it is combined with other measurements, it is given a larger weight than it should be.

This example shows clearly that we have to know if the uncertainties are proportional to the theory or to the experimental value. Assuming one when the other is correct can lead to problems. It is clear that many other systematic uncertainties depend on the theory and that should also be taken into account.
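The downward bias can be seen in a two-measurement toy example. This is a simplified sketch under our own assumptions, not the authors' calculation. Two measurements of the same quantity (true value 1.0) fluctuate symmetrically to 0.9 and 1.1, and each is given a variance $\Delta_1^2 + (\Delta_2 \cdot \mathrm{ref})^2$. The reference value is either the theory or the measurement itself, and the normalization term is treated as independent for each measurement. All names and numbers are hypothetical.

```python
def combine(measurements, delta_stat, delta_norm, reference):
    """Inverse-variance weighted average of the measurements.

    reference(x) returns the value that the normalization uncertainty is
    taken to be proportional to for a measurement x.
    """
    weights = [1.0 / (delta_stat ** 2 + (delta_norm * reference(x)) ** 2)
               for x in measurements]
    return sum(w * x for w, x in zip(weights, measurements)) / sum(weights)


TRUE_VALUE = 1.0
measurements = [0.9, 1.1]   # symmetric fluctuations around the true value

# Uncertainty proportional to the theory: both points get the same weight.
proportional_to_theory = combine(measurements, 0.05, 0.10, lambda x: TRUE_VALUE)

# Uncertainty proportional to the measured value: the low point gets a
# smaller uncertainty, hence a larger weight, and pulls the average down.
proportional_to_data = combine(measurements, 0.05, 0.10, lambda x: x)

print(proportional_to_theory)   # the symmetric average
print(proportional_to_data)     # lower than the symmetric average
```

Although the inputs are symmetric around the true value, the second combination comes out low purely because of how the uncertainty was assigned.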

4. Sources of uncertainties 

There are many sources of uncertainties besides the experimental uncertainties. They either have to be shown to be small enough to be neglected or they need to be included in the PDF uncertainties. For example: variation of the renormalization and factorization scales; non-perturbative and nuclear binding effects; the choice of functional form of the input PDF at the initial scale; accuracy of the evolution; Monte Carlo uncertainties; and dependence on theory cut-off.

5. Current fit 

Draconian measures were needed to restart from scratch and re-evaluate each issue. We fixed the renormalization and factorization scales, avoided data affected by nuclear binding and non-perturbative effects, and used an MRS-style parametrization for the input PDFs. The evolution of the PDFs is done by the Mellin transform method, see Ref. [?]. All the quarks are considered massless. We imposed a positivity constraint on $F_2$. A positivity constraint on other "observables" could also be imposed.




Figure 9. Correlation between two of the parameters: $\alpha_s$ and $\lambda_g$, see the text for their definition. Constant probability density levels are plotted.



At the moment we are using H1 and BCDMS (proton data) measurements of $F_2^p$ for our core set. In order to be able to use these data we have to assume that all the uncertainties are Gaussian distributed$^{11}$. We can then calculate $\chi^2(\lambda)$ and $P(\lambda)$ ($\approx \exp(-\chi^2/2)$) with all the correlations taken into account$^{12}$. We generated 50000 unit-weighted PDFs according to the probability function. For 532 data points, we obtained a minimum $\chi^2$ of 530 for 24 parameters. We have plotted in Fig. [?] the probability distribution of some of the parameters. Note that the first parameter is $\alpha_s$. The value is smaller than the current world average. However, it is known that the experiments we are using prefer a lower value of this parameter, see Ref. [?], and as already pointed out, our current uncertainties are lower limits. Note that the distribution of the parameter is not Gaussian, indicating that the asymptotic region is not reached yet. In this case, the blind use of the so-called chi-squared fitting method might be misleading.

From this large set of PDFs, it is straightforward to plot, for example, the correlation between different parameters and to propagate the uncertainties to other observables. In Fig. [?], the correlation between $\alpha_s$ and $\lambda_g$ is presented. $\lambda_g$ parametrizes the small Bjorken-$x$ behavior of the gluon distribution function at the initial scale: $xg(x) \sim x^{-\lambda_g}$. The lines are constant probability density levels that are characterized by a percentage, $\alpha$, which is defined such that $1 - \alpha$ is the ratio of the probability density corresponding to the level to the maximum probability density.

11 No information being given about the distribution of the uncertainties.

12 Here we assumed that none of the systematic uncertainties depend on the theory.

Figure 8. Plot of the distribution (black histograms) of four of the parameters. The first one is $\alpha_s$, the strong coupling constant at the mass of the $Z$ boson. The line is a Gaussian distribution with the same average and standard deviation as the histogram.

Figure 10. Correlation between the production cross sections for the $W$ and $Z$ vector bosons at the Tevatron, $\sigma_W$ and $\sigma_Z$ (in nb, includes leptonic branching fraction). The solid and dashed lines show the constraint due to the CDF measurement of the cross section ratio.

In Fig. 10, we show the correlation between two observables, the production cross sections for the $W$ and $Z$ vector bosons at the Tevatron, along with the experimental result from CDF. The constant probability density levels are shown. The agreement between the theory and the data is qualitatively good.



Figure 11. Data-theory for the lepton charge asymme- 
try in W decay at the Tevatron. 



In Fig. 11, we present data-theory for the lepton charge asymmetry in $W$ decay at the Tevatron. The data are the CDF result [?] and the theory corresponds to the average value over the PDF sets for each data point, as defined in Eq. [?]. The dashed lines are the theory plots corresponding to the one standard deviation over the PDF sets, also defined in Eq. [?]. The inner error bars are the statistical and systematic uncertainties added in quadrature$^{13}$. The outer error bars correspond to the experiment and theory uncertainties added in quadrature. The theory uncertainty is the uncertainty associated with the Monte Carlo integration; the factorization and renormalization scale dependence are small and can be neglected. 5000 PDFs were used to generate this plot. It is well known that the data we have included so far in our fit mainly constrain the sum of the quark parton distributions weighted by the square of the charges. The lepton charge asymmetry is sensitive to the ratio of up-type to down-type quarks and is therefore not well constrained. We can add this data set by simply weighting each PDF from our set with the likelihood of the new data. The resulting new range of the theory (calculated with weighted sums) is given by the band of solid curves in Fig. [?].

13 The distribution of the uncertainties and the point to point correlation of the systematic uncertainties were not published, such that we had to assume Gaussian uncertainties and no correlation.

Figure 12. Same as in Fig. 10 for the weighted PDFs.

The effect of the inclusion of the lepton charge asymmetry can be seen in Fig. [?], where the correlation between the $W$ and the $Z$ cross section is shown again but for the weighted PDFs. The agreement with the data is better than before, but the probability density now has two maxima.

It has been argued that for Run II at the Tevatron, the measurement of the number of $W$ and $Z$ bosons produced could be used as a measurement of the luminosity. That of course requires the knowledge of the cross section with small enough uncertainties. In Fig. [?] the luminosity probability distribution is presented for the unit-weighted and weighted PDF sets along with the luminosity used by CDF. The plot for the weighted set also has two maxima, as in Fig. 12.

5.1. Conclusions 

In conclusion, we remind the reader again that 
all the results should be taken as illustration of the 



Figure 13. Probability distribution of the luminosity
(Run 1a, in pb$^{-1}$) for the unit-weighted (right plot) and
weighted (middle plot) PDFs, compared to the value
used by CDF (left plot).



method and that not all the uncertainties have been 
included in the fitting. 
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EXPERIMENTAL UNCERTAINTIES AND 
THEIR DISTRIBUTIONS IN THE 
INCLUSIVE JET CROSS SECTION. 

R. Hirosky 
University of Illinois, Chicago, IL 60607 

1. Introduction 

This workshop has been an important channel of 
communication between those performing global par- 
ton distribution function (pdf) fits and the experimen- 
tal groups who provide the data at the Tevatron. In 
the particular case of jets analyses we have initiated a 
detailed dialog on the sources and distributions of ex- 
perimental uncertainties. As part of my participation 
in the workshop, I have used the D0 inclusive jet cross 
section as an example of a jet measurement with a com- 
plex ensemble of uncertainties and have provided de- 
scriptions of each component uncertainty. Such dialogs 
will prove crucial in obtaining the best constraints on 
allowable pdf models from the data. 

2. Uncertainties on the CDF and D0 inclusive 
jet cross sections 

In the first meeting we summarized the jet inclusive 
cross section measurements from the D0 [ [lj and CDF [ 

experiments. In particular, we illustrated the major 
corrections applied to the data, namely jet Et scale 
and Et resolution corrections, as well as the deriva- 
tion methods for these corrections employed by each 
experiment. To review these methods see [ ||-[ [| and 
references therein. 

The uncertainties by component in the CDF and
D0 inclusive jet cross sections are shown in Figs. 14-15.
Each component of the uncertainty reported for
the CDF cross section is taken to be completely correlated
across jet $E_T$, while individual components are
independent of one another. The D0 uncertainties
(shown here symmetrized) are also independent of one
another; however, each component may be either fully
or partially correlated across jet $E_T$. In the case of the
energy scale uncertainty the band shown is constructed
from eight subcomponents.

2.1. Comparisons with theory 

The two experiments have used various means to 
compare their measurements to theoretical predictions. 
CDF has published a comparison of their cross sec- 
tion to a next-to-leading order (NLO) QCD calcula- 
tion using a variety of pdf models by means of var- 
ious normalization-insensitive, shape-dependent sta- 
tistical measures [ || (Kolmogorov-Smirnov, Cramer- 
VonMises, Anderson-Darling). D0 has formulated a 
covariance matrix using each uncertainty component 
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Figure 14. Uncertainties by component in the CDF 
inclusive jet cross section, 

$1/(\Delta\eta\,\Delta E_T) \int\!\!\int d^2\sigma/(dE_T\,d\eta)\; dE_T\,d\eta$, $0.1 < |\eta| < 0.7$



in the cross section and its $E_T$ correlation information
and employed a $\chi^2$ test to compare to NLO QCD [
|l| . It is difficult to generalize the various shape statistics
to include non-trivial correlations in the systematic
uncertainties, and although correlations may be easily
added to a covariance matrix, $\chi^2$ tests can show
biases when faced with correlated scale errors. Reference
[ [| illustrates how correlated scale errors may
lead to biases in parameter estimation by noting that
systematic errors reported as a fraction of the observed
data can be evaluated as artificially small when applied
to a point that fluctuates low. This bias may be mitigated
by parameterizing the systematic scale errors
as percentages of a smooth model of the data or by
placing them on the smooth theory directly (see contributions
to these proceedings by W. Giele, S. Keller,
and D. Kosower).
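
The bias described above can be reproduced with a toy calculation. The sketch below is an illustration only: the two measured values, the 2% uncorrelated error and the 10% common normalization error are hypothetical numbers chosen for the example, and the function names are not taken from any of the cited analyses.

```python
def fit_with_error_on_data(values, rel_stat, rel_norm):
    """Least-squares average of two measurements of one quantity when the
    common normalization error is a fraction of the *observed* values."""
    x1, x2 = values
    s1, s2 = rel_stat * x1, rel_stat * x2
    # Covariance matrix: uncorrelated part plus fully correlated scale part.
    v11 = s1 ** 2 + (rel_norm * x1) ** 2
    v22 = s2 ** 2 + (rel_norm * x2) ** 2
    v12 = rel_norm ** 2 * x1 * x2
    # Analytic minimum of chi^2 = (x - k)^T V^-1 (x - k) with respect to k.
    numerator = v22 * x1 - v12 * (x1 + x2) + v11 * x2
    denominator = v11 + v22 - 2.0 * v12
    return numerator / denominator


def fit_with_error_on_model(values, rel_stat):
    """Same average when the normalization error is placed on the model.
    A common term added to every covariance element does not change the
    best-fit value, so only the uncorrelated errors set the weights."""
    weights = [1.0 / (rel_stat * x) ** 2 for x in values]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


biased = fit_with_error_on_data((8.0, 8.5), 0.02, 0.10)
unbiased = fit_with_error_on_model((8.0, 8.5), 0.02)
```

In this toy case `biased` comes out below both measurements, while `unbiased` lies between them. The point that fluctuates low is assigned a smaller absolute scale error and therefore pulls the fit down.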

Other difficulties arise in the interpretation of $\chi^2$ probabilities
when uncertainties show large correlations. The
probability that a prediction agrees with the data for
a given $\chi^2$ is calculated assuming that the $\chi^2$ follows
the distribution:



$$ f(x; n) = \frac{x^{n/2-1}\, e^{-x/2}}{2^{n/2}\,\Gamma(n/2)} \qquad (31) $$



where n is the number of degrees of freedom of the
data set. The probability of getting a worse value of
$\chi^2$ than the one obtained for the comparison is given
by:

$$ P(\chi^2; n) = \int_{\chi^2}^{\infty} f(x; n)\, dx \qquad (32) $$
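
Equations (31) and (32) can be evaluated directly. The following is a minimal sketch using only the Python standard library; the truncation of the upper integration limit and the number of Simpson steps are choices made here for illustration, not part of the analyses discussed.

```python
import math


def chi2_pdf(x, n):
    """Eq. (31): chi^2 density for n degrees of freedom, valid for x > 0."""
    log_f = ((n / 2.0 - 1.0) * math.log(x) - x / 2.0
             - (n / 2.0) * math.log(2.0) - math.lgamma(n / 2.0))
    return math.exp(log_f)


def chi2_prob(chi2, n, steps=20000):
    """Eq. (32): probability of finding a chi^2 larger than the observed one.

    The infinite upper limit is replaced by a point far out in the tail
    (an assumption of this sketch) and the integral is done with
    Simpson's rule, so `steps` must be even.
    """
    if chi2 <= 0.0:
        return 1.0
    upper = max(chi2, n) + 50.0 * math.sqrt(2.0 * n) + 50.0
    h = (upper - chi2) / steps
    total = chi2_pdf(chi2, n) + chi2_pdf(upper, n)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * chi2_pdf(chi2 + i * h, n)
    return total * h / 3.0
```

For n = 2 the density is a pure exponential, so `chi2_prob(x, 2)` should reproduce exp(-x/2); this gives a simple check of the numerical integration.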



Hence, to verify the accuracy of the probabilities
quoted in the recent D0 cross section papers (inclusive
jet cross section [[lj and dijet mass spectrum [|?|),
the $\chi^2$ distribution may be compared to Equation 31
with the appropriate number of degrees of freedom.
The $\chi^2$ distribution for the D0 dijet mass spectrum
was tested by developing a Monte Carlo program [8]
that generates many trial experiments based on an ansatz
cross section determined from the best smooth fit to
the data (with a total of 15 bins, or 15 degrees of freedom).
The first step generated trials based on statistical
fluctuations, taking the true number of events
per bin as given by the ansatz cross section. The trial
spectra were then generated for each bin according to
Poisson statistics. The $\chi^2$ for each of these trials was
calculated using the difference between the true and
the generated values. Figure 16 (solid curve) shows
the $\chi^2$ distribution for all of the generated trials. The
distribution agrees well with Equation 31 for 15 degrees
of freedom. The next step assumes that the uncertainties
are correlated as in the measurement of the dijet mass
cross section. Trial spectra are generated using these
uncertainties to generate a $\chi^2$ distribution (see the dotted
curve in Fig. 16). It is clear that this $\chi^2$ distribution is
very similar to the curve predicted by Equation 31.
Hence, any probability generated using Equation 32
will be approximately correct. The resulting $\chi^2$ distribution
was fitted by Equation 31, and the resulting
fit is consistent with the distribution if 14.6 degrees of
freedom are assumed.
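
The first step of the test described above (purely statistical, uncorrelated fluctuations) can be sketched as follows. This is not the Monte Carlo program referred to in the text: the 15-bin ansatz spectrum, the seed and the helper names are hypothetical, and the correlated-uncertainty step is not included.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method
    (simple, and adequate for the moderate means used in this toy)."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def chi2_trials(expected, n_trials, seed=12345):
    """Generate trial spectra around the 'true' bin contents and return
    the chi^2 of each trial with respect to those true values."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        chi2 = 0.0
        for mu in expected:
            n = poisson(rng, mu)
            # Poisson variance equals the true mean of the bin.
            chi2 += (n - mu) ** 2 / mu
        results.append(chi2)
    return results


# Hypothetical steeply falling ansatz spectrum with 15 bins.
ansatz = [200.0 * 0.8 ** i for i in range(15)]
```

A histogram of `chi2_trials(ansatz, 2000)` can then be compared with Eq. (31) for 15 degrees of freedom; in particular the mean of the trials should be close to 15.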

A similar test using the D0 inclusive jet cross section 



Figure 16. $\chi^2$ distribution for random fluctuations
around the nominal D0 dijet mass cross section.
(Solid) Errors are fluctuated as uncorrelated. (Dashed)
$E_T$ correlations are included.



finds the distributions shown in Fig. 17. The two distributions
agree well for $\chi^2$ values below approximately
15 and then begin to diverge slowly. The distribution
based on the cross section uncertainties includes
a larger tail than the $\chi^2$ distribution generated with
the wholly uncorrelated uncertainties, implying that
probabilities based on a $\chi^2$ analysis will be slightly underestimated.
See also the talks by B. Flaugher in this
workshop for additional observations and comments on
$\chi^2$ analyses.

3. Beyond the Normal assumption 

Independent of any difficulties due to correlated uncertainties,
a $\chi^2$ test necessarily relies on the assumption
that the uncertainties follow a normal distribution.
This may be a reasonable approximation in some cases.
Upon close inspection, however, we expect this assumption to
be generally false for most rapidly varying observables
(i.e. steeply falling cross section measurements). In
the most obvious case, some experimental
uncertainties will simply be non-Gaussian in their distribution;
furthermore, symmetric uncertainties in
the abscissa variable will develop into asymmetric uncertainties
when propagated through to the measured
distribution. The latter case is illustrated as follows.
Consider an $E_T$-independent jet $E_T$ scale error of 2%.
What is its effect on an inclusive jet cross section versus $E_T$?

Figure 17. $\chi^2$ distribution for random fluctuations
around the nominal D0 inclusive jet cross section.
(Solid) Errors are fluctuated as uncorrelated. (Dashed)
$E_T$ correlations are included.

Jets are shifted bin-to-bin by fluctuating their
$E_T$ values within the 2% range and, as a result of the
steeply falling cross section, more jets from low $E_T$
values are shifted into higher $E_T$ bins by one extreme
of this scale uncertainty than are shifted in the reverse
direction, out of higher $E_T$ bins, by the other extreme.
Figure 18 shows how a flat 2% $E_T$
scale uncertainty alters the measured cross section, using
a smooth fit to the D0 data as the nominal cross
section model. In general the degree of this asymmetry
will depend on the steepness of the measured distribution.
In order to define a covariance matrix, such errors
are typically symmetrized.
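
The origin of the asymmetry can be seen in a simple differential (unbinned) toy. The sketch below assumes a pure power-law spectrum with a hypothetical index of 6; the real cross section, its binning and the smooth fit used for Fig. 18 are not reproduced here.

```python
def power_law(e_t, index=6.0):
    """Hypothetical steeply falling spectrum, dsigma/dE_T ~ E_T^-index."""
    return e_t ** (-index)


def scale_shift_ratio(e_t, scale, spectrum=power_law):
    """Ratio of the spectrum observed when every jet E_T is multiplied by
    `scale` to the nominal spectrum, at the same measured E_T."""
    # A jet measured at e_t has true energy e_t / scale; the extra factor
    # 1 / scale is the Jacobian of the change of variable.
    return spectrum(e_t / scale) / scale / spectrum(e_t)


up = scale_shift_ratio(100.0, 1.02)    # energy scale 2% too high
down = scale_shift_ratio(100.0, 0.98)  # energy scale 2% too low
```

For this power law the ratios are 1.02^5 and 0.98^5, i.e. roughly +10.4% and -9.6%: a symmetric 2% error on the abscissa becomes an asymmetric error on the cross section, and the asymmetry grows with the steepness of the spectrum.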

The use of an approximate covariance matrix will
also result in a loss of sensitivity when errors are shown
to follow distributions with tails smaller than those of a normal
distribution. As an example we show a correction
factor with uncertainties of this type from the D0 jet
cross section analysis in Fig. 19. This figure shows the
hadronic response correction for jets as a function of
jet energy. The correction is derived from an analysis
of $\gamma$ + jet data [ ]. The bands delimit regions that
contain ensembles of deviations from the nominal response
within certain confidence limits. It is evident
that in this case assuming the uncertainty follows a
normal distribution with variance equal to the 68%
limits shown will tend to underestimate the sensitivity
of the data for excluding certain classes of theories.
Figure 20 shows the range of cross section uncertainty



Figure 18. Example of a 2% Et scale error propagated 
through an inclusive jet cross section measurement. 



due to the response component only as a function of 
confidence level for several Et values of the D0 cross 
section. 

4. Application to pdf constraints 

In this workshop W. Giele, S. Keller, and D. Kosower 
have reported on a method for extracting pdf distribu- 
tions with quantitative estimates of pdf uncertainties. 
In effect their method [ || uses a Bayesian approach 
that integrates sets of pdf parameterizations over prop- 
erly weighted samples of experimental uncertainties to 
produce a set of pdf models consistent with the data 
within a given confidence level. The basic method may 
be extended to use data with arbitrary error distribu- 
tions and correlations. For such methods to function 
reliably the experiments must be able to provide de- 
tailed descriptions of their error distributions. Giele 
et al. make a distinction between 'errors on the data' 
and 'errors on the theory' for estimation of the most 
likely pdf models. In this context we take only uncer- 
tainties depending directly on the number of events in 
a bin as 'errors on the data'. Other typical sources of 
uncertainty, luminosity, energy scale, resolution, etc., 
may be treated as 'errors on the theory' in that they 
are in some sense independent of the statistical pre- 
cision of the data and represent how an underlying, 
true, distribution may be distorted by observation in 
the experiment. 

As a result of these dialogs, we have revisited the D0
response uncertainty (our largest uncertainty in the inclusive
jet cross section measurement) from Fig. 19 and
generated a sampling of the probability density function
for distributions in its parameters. This probability
density function contains all the relevant information
on both the shape of the uncertainty distribution and
point-to-point correlations. It is clear that providing
such information is a significant enhancement over traditional
methods of summarizing experimental uncertainties.
Optimum utilization of the data demands a
detailed understanding and reporting of its associated
uncertainties. Through our fruitful discussions in this
workshop, we look forward to setting an example for
the reporting of experimental uncertainties and to fully
exploiting our cross section data in pdf analyses in the
near future.

Figure 19. D0 jet response correction versus energy.
The outer bands show the extreme deviation in response
at a given confidence level as a function of jet
energy.



Figure 20. D0 response uncertainty propagated 
through the inclusive jet cross section measurement at 
various Et values. The solid bands represent extreme 
variations at various confidence levels. The dashed 
bands illustrate the overestimation of these variations 
by using a Gaussian approximation. 
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PARTON DENSITY UNCERTAINTIES AND 
SUSY PARTICLE PRODUCTION 

T. Plehn$^{a,14}$ and M. Krämer$^{b,15}$

a) Department of Physics, University of Wisconsin, 
Madison WI 53706, USA; b) Department of Physics 
and Astronomy, University of Edinburgh, Edinburgh 
EH9 3JZ, Scotland. 

Abstract 

Parton densities are important input parameters for 
SUSY particle cross section predictions at the Teva- 
tron. Accurate theoretical estimates are needed to 
translate experimental limits, or measured cross sec- 
tions, into SUSY particle mass bounds or mass deter- 
minations. We study the PDF dependence of next-to- 
leading order cross section predictions, with emphasis 
on a new set of parton densities [ [lj. We compare 
the resulting error to the remaining theoretical uncer- 
tainty due to renormalization and factorization scale 
variation in next-to-leading order SUSY-QCD. 

1. Introduction 

The search for supersymmetric particles is among
the most important endeavors of present and future
high energy physics. At the upgraded $p\bar{p}$ collider Tevatron,
the searches for squarks and gluinos (and especially
the lighter stops and sbottoms), as well as for
the weakly interacting charginos and neutralinos, will
cover a wide range of the MSSM parameter space [ ].

The hadronic cross sections for the production of
SUSY particles generally suffer from unknown theoretical
errors at the Born level [ H). For strongly interacting
particles the dependence on the renormalization
and factorization scale has been used as a measure
for this uncertainty, leading to numerical ambiguities
of the order of 100%. For Drell-Yan type weak production
processes the dependence on the factorization
scale is mild. However, a comparison of leading and
next-to-leading order predictions [ || reveals that the
impact of higher-order corrections is much larger than
the estimate through scale variation would have suggested.
The use of next-to-leading order calculations [
[| q, [7j is thus mandatory to reduce theoretical uncertainties
to a level at which one can reliably extract
mass limits from the experimental data.

14 Supported in part by DOE grant DE-FG02-95ER-40896 and
in part by the University of Wisconsin Research Committee with
funds granted by the Wisconsin Alumni Research Foundation.
15 Supported in part by the EU Fourth Framework Programme
'Training and Mobility of Researchers', Network 'Quantum
Chromodynamics and the Deep Structure of Elementary Particles',
contract FMRX-CT98-0194 (DG 12 - MIHT).

In addition to the scale ambiguity and the impact of
perturbative corrections beyond next-to-leading order,
hadron collider cross sections are subject to uncertainties
coming from the parton densities and the associated
value of the strong coupling. Previously, the
only way to estimate the PDF errors was to compare
the best-fit results from various global PDF analyses.
Clearly, this is not a reliable measure of the true uncertainty.
As a first step towards a more accurate error
estimate, the widely used sets CTEQ [ § and MRST [
|j| now offer different variants of PDF sets, e.g. using
different values of the strong coupling constant. In this
letter we compare their predictions to the preliminary
GKK parton densities [ GJ, which provide a systematic
way of propagating the uncertainties in the PDF
determination to new observables.

2. Stop Pair Production 

For third generation squarks the off-diagonal left-right
mass matrix elements do not vanish, but lead to
mixing of the stop (and sbottom) states. The lighter mass
eigenstate, denoted as $\tilde{t}_1$, is expected to be the lightest
strongly interacting supersymmetric particle. Moreover,
its pair production cross section, to a very good
approximation, only depends on the stop mass, in contrast
to the light flavor squark production. Nevertheless,
considering the different decay channels complicates
the analyses [ |[ [h]]. At the Tevatron the fraction
of stops produced in quark-antiquark annihilation
and in gluon fusion varies strongly with the stop mass.
Close to threshold the valence quark luminosity is dominant,
but for lower masses a third of the hadronic cross
section can be due to incoming gluons [ |?J .

In Figure 21 we compare the total $\tilde{t}_1$-pair production
cross sections for three sets of parton densities: only for
incoming quarks do the CTEQ4 and MRST99 results
lie on top of each other. For gluon fusion the corresponding
cross sections differ by ~ 10%. The GKK set
centers around a significantly smaller value. This is in
part due to the low average value $\langle\alpha_s({\rm GKK})\rangle = 0.108$,
which is expected to increase after including more experimental
information in the GKK analysis. But even
the normalized cross section $\sigma/\alpha_s^2$ is still smaller by
35% compared to CTEQ4 and MRST99 because of
the entangled fit of the strong coupling constant and
the parton densities. However, the width of the Gaussian
fit to the GKK results gives an uncertainty of
2% and 8% for the quark-antiquark and gluon fusion
channels, respectively, similar to the difference between CTEQ4 and
MRST99.



For heavier stop particles, Figure 22, the gluon luminosity
is strongly suppressed due to the large final
state mass, and mainly valence-quark-induced processes
contribute to the cross section. The Gaussian
distribution of the GKK results has a width of ~ 2%.
The comparably large difference between CTEQ4 and
MRST99 is caused by the small fraction of gluon-induced
processes, since the gluon flux at large values of
x differs for CTEQ4 and MRST99 by approximately
40%.

3. Chargino/Neutralino Production 

The production of charginos and neutralinos at the
Tevatron is particularly interesting in the trilepton
$\tilde\chi_2^0\tilde\chi_1^\pm$ and the light chargino $\tilde\chi_1^+\tilde\chi_1^-$ channels [|Pj|. The
next-to-leading order corrections to the cross sections [
D reduce the factorization scale dependence, but at the
same time introduce a small renormalization scale dependence.
A reliable estimate of the theoretical error
from the scale ambiguity will thus only be possible beyond
next-to-leading order.

The Gaussian distribution of the GKK parton
densities for light chargino pairs is shown in Figure 23.
For the chosen mSUGRA parameters ($m_0 = 100$ GeV,
$A = 300$ GeV, $m_{1/2} = 150$ GeV) the width is
~ 2%, as one would expect from the quark-antiquark
channel of the stop production. But in contrast to
the stop production, where all quark luminosities add
up, the chargino/neutralino channels can be extremely
sensitive to systematic errors in different parton densities
due to destructive interference between s and t
channel diagrams. The total trilepton cross section, for
example, will therefore be a particular challenge for a
reliable error estimate.

4. Outlook 

We have briefly reviewed the status of the theoretical 
error analysis of SUSY cross sections at the Tevatron. 
For strongly interacting final state particles, the inclu- 
sion of next-to-leading order corrections reduces the 
renormalization and factorization scale ambiguity to a 
level < 10% where the size of the PDF errors becomes 
phenomenologically relevant. We have compared dif- 
ferent recent PDF sets provided by the CTEQ [ |j| and 
MRST [ H collaborations to the preliminary GKK par- 
ton densities [ [jj . The large spread in the cross section 
predictions can mainly be attributed to the low aver- 
age value of the strong coupling associated with the 
GKK sets. We expect this spread to be reduced once 
more data have been included in the GKK analysis 
and the corresponding average value of the strong cou- 
pling becomes closer to the world average. For weak 
supersymmetric Drell-Yan type processes [ || the scale 



dependence at NLO cannot serve as a measure for the
theoretical error since the renormalization scale dependence
is only introduced at NLO. The PDF-induced
errors for e.g. the case of $\tilde\chi_1^+\tilde\chi_1^-$ production are small;
however, interference effects between the different partonic
contributions must be taken into account.

The recently available variants of PDF sets provided 
by CTEQ and MRST and, in particular, the GKK par- 
ton densities allow for the first time a systematic explo- 
ration of PDF uncertainties for the prediction of SUSY 
particle cross sections. The preliminary GKK results 
do not yet allow a conclusive answer, but they point 
the way towards a complete and reliable error analysis 
in the near future.

Figure 23. NLO production cross section for the
chargino channel $\tilde\chi_1^+\tilde\chi_1^-$. The Gaussian fits the preliminary
GKK parton densities.
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Figure 21. NLO production cross section for a light 
stop. The Gaussian fits the preliminary GKK parton 
densities. The renormalization/factorization scale is 
varied around the average final state mass. 
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Figure 22. NLO production cross section for a heav- 
ier stop, dominated by incoming valence quarks. The 
Gaussian fits the preliminary GKK parton densi- 
ties. The renormalization/factorization scale is varied 
around the average final state mass. 
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Abstract 

Parton distribution functions are determined by the 
comparison of finite-order calculations with data. We 
briefly discuss the interplay of higher order corrections 
and PDF determinations, and the use of soft-gluon re- 
summation in global fits. 

1. Factorization & the NLO model

A generic inclusive cross section for the process
$A + B \to F + X$ with observed final-state system F, of
total mass Q, can be expressed as



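$$ Q^4\, \frac{d\sigma_{AB\to FX}}{dQ^2} = \phi_{a/A}(x_a,\mu^2) \otimes \phi_{b/B}(x_b,\mu^2) \otimes \hat\sigma_{ab\to FX}(z,Q,\mu) , \qquad (33) $$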



with $z = Q^2/(x_a x_b S)$. The $\hat\sigma_{ab}$ are partonic hard-scattering
functions, $\hat\sigma = \sigma_{\rm Born} + (\alpha_s(\mu^2)/\pi)\,\hat\sigma^{(1)} + \ldots$.
They are known to NLO for most processes in the standard
model and its popular extensions. Corrections begin
with higher, uncalculated orders in the hard scattering,
which respect the form of Eq. (33). The discussion
is simplified in terms of moments with respect to
$\tau = Q^2/S$,



$$ \tilde\sigma_{AB\to FX}(N) = \int_0^1 d\tau\, \tau^{N-1}\, Q^4\, \frac{d\sigma_{AB\to FX}}{dQ^2} = \sum_{a,b} \tilde\phi_{a/A}(N,\mu^2)\; \tilde{\hat\sigma}_{ab\to FX}(N,Q,\mu)\; \tilde\phi_{b/B}(N,\mu^2) , \qquad (34) $$

where the moments of the $\phi$'s and $\hat\sigma_{ab\to FX}$ are defined
similarly.

Eqs. (33) and (34) are starting points for both the
determination and the application of parton distribution
functions (PDFs), $\phi_{i/H}$, using 1-loop $\hat\sigma$'s [[l], [|, [!|.
We may think of this collective enterprise as an "NLO
model" for the PDFs, and for hadronic hard scattering
in general. For precision applications we ask how well
we really know the PDFs [ g, |, §. Partly this is a
question of how well data constrain them, and partly
it is a question of how well we could know them, given
finite-order calculations in Eqs. (33) and (34). We will
not attempt here to assign error estimates to theory.
We hope, however, to give a sense of how to distinguish
ambiguity from uncertainty, and how our partial
knowledge of higher orders can reduce the latter.

2. Uncertainties, schemes &: scales 

It is not obvious how to quantify a "theoretical uncertainty",
since the idea seems to require us to estimate
corrections that we haven't yet calculated. We
do not think an unequivocal definition is possible, but
we can try at least to clarify the concept, by considering
a hypothetical set of nucleon PDFs determined
from DIS data alone [ ]. To make such a determination,
we would invoke isospin symmetry to reduce the
set of PDFs to those of the proton, $\phi_{a/P}$, and then
measure a set of singlet and nonsinglet structure functions,
which we denote $F^{(i)}$. Each factorized structure
function may be written in moment space as

$$ \tilde F^{(i)}(N,Q) = \sum_a \tilde C_a^{(i)}(N,Q,\mu)\; \tilde\phi_{a/P}(N,\mu^2) , \qquad (35) $$

in terms of which we may solve for the parton distributions
by inverting the matrix C,

$$ \tilde\phi_{a/P}(N,\mu^2) = \sum_i \tilde C^{-1}{}_a^{(i)}(N,Q,\mu)\; \tilde F^{(i)}(N,Q) . \qquad (36) $$

With "perfect" F's at fixed Q, and with a specific
approximation for the coefficient functions, we could
solve for the moment-space distributions numerically,
without the need of a parameterization. In a world
of perfect data, but of incompletely known coefficient
functions, uncertainties in the parton distributions
would be entirely due to the "theoretical" uncertainties
of the C's:

$$ \delta\tilde\phi_{a/P}(N,\mu) = \sum_i \delta\tilde C^{-1}{}_a^{(i)}(N,Q,\mu)\; \tilde F^{(i)}(N,Q) . \qquad (37) $$

Our question now becomes, how well do we know the 
C's? In fact this is a subtle question, because the coef- 
ficient functions depend on choices of scheme and scale. 



Factorization schemes are procedures for defining coefficient
functions perturbatively. For example, choosing
for $F_2$ the LO (quark) coefficient function in Eq.
(35) defines a DIS scheme (with C independent of $\mu$,
which is then to be taken as Q in $\phi$). Computing the
C's from partonic cross sections by minimal subtraction
to NLO defines an NLO $\overline{\rm MS}$ scheme, and so on.
Once the choices of C's and $\mu$ are made, the PDFs are
defined uniquely.

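Evolution in an $\overline{\rm MS}$ or related scheme enters
through

$$ \mu\frac{d}{d\mu}\, \tilde\phi_{a/H}(N,\mu^2) = -\Gamma_{ab}(N,\alpha_s(\mu^2))\; \tilde\phi_{b/H}(N,\mu^2) $$
$$ \mu\frac{d}{d\mu}\, \tilde C_c^{(i)}(N,Q,\mu) = \tilde C_d^{(i)}(N,Q,\mu)\; \Gamma_{dc}(N,\alpha_s(\mu^2)) . \qquad (38) $$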

In principle, by Eq. (38), the scale dependence of the
$C_a$ exactly cancels that of the PDFs in Eq. (35) and,
by extension, in Eq. (33). This cancellation, however,
requires that each C and the anomalous dimensions $\Gamma$
be known to all orders in perturbation theory.

To eliminate $\mu$-dependence up to order $\alpha_s^n$ we
need $\hat\sigma$ to order $\alpha_s^n$ and the $\Gamma_{ab}$ to $\alpha_s^{n+1}$. One-loop
(NLO) QCD corrections to hard scattering require two-loop
splitting functions, which are known. The complete
form of the NNLO splitting functions is still
somewhere over the horizon [ ]. Even when these
are known, it will take some time before more than a
few hadronic hard scattering functions are known at
NNLO.

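We can clarify the role of higher orders by relating
structure functions at two scales, $Q_0$ and Q. Once we
have measured $\tilde F(N,Q_0)$, we may predict $\tilde F(N,Q)$ in
terms of the relevant anomalous dimensions and coefficient
functions by

$$ \tilde F(N,Q) = \tilde F(N,Q_0)\; \exp\left[-\frac{1}{2}\int_{Q_0^2}^{Q^2} \frac{d\mu'^2}{\mu'^2}\, \Gamma(N,\alpha_s(\mu'^2))\right] \times \left[\frac{\tilde C(N,Q,Q)}{\tilde C(N,Q_0,Q_0)}\right] . \qquad (39) $$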

This prediction, formally independent of PDFs and independent
of the factorization scale, has corrections
from the next, still uncalculated order in the anomalous
dimension and in the ratio of coefficient functions.
The asymptotic freedom of QCD gives a special role to
LO: only the one-loop contribution to $\Gamma$ diverges with
Q in the exponent, and contributes to the leading, logarithmic
scale breaking. NLO corrections already decrease
as the inverse of the logarithm of Q, NNLO
as two powers of the log. Thus, the theory is self-regulating
towards high energy, where dependence on
uncalculated pieces in the coefficients and anomalous
dimensions becomes less and less important.
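
The step from Eqs. (35) and (38) to Eq. (39) can be made explicit. The following is a sketch for a single nonsinglet channel, an assumption made here so that $\Gamma$ and $C$ are numbers rather than matrices; the tilde notation for moments is also a convention chosen for this sketch.

```latex
\begin{align*}
\tilde F(N,Q) &= \tilde C(N,Q,\mu)\,\tilde\phi(N,\mu^2)
  \quad\Rightarrow\quad
  \tilde F(N,Q) = \tilde C(N,Q,Q)\,\tilde\phi(N,Q^2)
  \quad (\mu = Q), \\
\mu\frac{d}{d\mu}\,\tilde\phi(N,\mu^2) &= -\Gamma\,\tilde\phi(N,\mu^2)
  \quad\Rightarrow\quad
  \tilde\phi(N,Q^2) = \tilde\phi(N,Q_0^2)\,
  \exp\left[-\int_{Q_0}^{Q}\frac{d\mu'}{\mu'}\,
  \Gamma(N,\alpha_s(\mu'^2))\right], \\
\tilde\phi(N,Q_0^2) &= \frac{\tilde F(N,Q_0)}{\tilde C(N,Q_0,Q_0)}
  \quad\Rightarrow\quad
  \tilde F(N,Q) = \tilde F(N,Q_0)\,
  \exp\left[-\frac{1}{2}\int_{Q_0^2}^{Q^2}\frac{d\mu'^2}{\mu'^2}\,
  \Gamma(N,\alpha_s(\mu'^2))\right]
  \frac{\tilde C(N,Q,Q)}{\tilde C(N,Q_0,Q_0)} .
\end{align*}
```

The last line uses $d\mu'/\mu' = \frac{1}{2}\, d\mu'^2/\mu'^2$. The PDF has dropped out, which is why the prediction is formally independent of both the PDFs and the factorization scale.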
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The general successes of the NLO model strongly 
suggest that relations like (39) are well-satisfied for a 
wide range of observables and values of N (or x) in DIS 
and other processes. This does not mean, however, 
that we have no knowledge of, or use for, information 
from higher orders. In particular, near x = 1 PDFs are
rather poorly known [ g). At the same time, the ratio
of C's depends on N, and if $\alpha_s \ln N$ is large, it becomes
important to control higher-order dependence on $\ln N$.
This is a task usually referred to as resummation, to
which we now turn.

3. Resummation 

Let us continue our discussion of DIS, describing
what is known about the N-dependence of the coefficient
functions C, as a step toward understanding the
role of higher orders. Specializing again for simplicity
to nonsinglet or valence, the resummed coefficient
function may be written as [ ^, |l^|

$$ \tilde C^{\rm res}(N,Q,\mu) = \tilde C^{\rm NLO}_{\rm sub}(N,Q,\mu) + C^{\rm DIS}_\delta\; e^{E_{\rm DIS}(N,Q,\mu)} , \qquad (40) $$



where "sub" implies a subtraction on $\tilde C^{\rm NLO}$ to keep
$\tilde C^{\rm res}$ exact at order $\alpha_s$, and where $C^{\rm DIS}_\delta$ corresponds to
the NLO N-independent ("hard virtual") terms. The
exponent resums logarithms of N:



Edjb(N,Q,h) 



(41) 



2/ N A 4 

with A = Ae 7 - 5 , and with 

A(a s ) = 
a 



A{a a {^ 2 )) HNfi' 2 /Q 2 ) + B(a s {^ 2 )) 



C F 



1 



B(a s ) 



2tt 



2 2tt 



Ca 



67 
18 



7T~ 
"if 



10 



(42) 



Eq. (|4^) is accurate to leading (LL) and next-to-leading 
logarithms (NLL) in N in the exponent: $\alpha_s^m \ln^{m+1} N$
and $\alpha_s^m \ln^m N$, respectively. The N dependence of the
ratio $\tilde C_2^{\rm res}(N,Q,Q)/\tilde C_2^{\rm NLO}(N,Q,Q)$ is shown in Fig.
24 with $Q^2$ = 1, 5, 10, 100 GeV$^2$. At N = 1 the
ratio is unity. It is less than unity for moderate N, but
then begins to rise, with a slope that increases strongly
for small Q. At low $Q^2$ and large N, higher orders can
be quite important. What does this mean for PDFs?
We can certainly refit PDFs with resummed coefficient
functions, and we see that the high moments of such
PDFs are likely to be quite different from those from
NLO fits.

To get a sense of how such an NLL/NLO-$\overline{\rm MS}$ scheme
might differ from a classic NLO-$\overline{\rm MS}$ scheme, we resort
to a model set of resummed distributions, determined

Figure 24. Ratio of Mellin-N moments of resummed
and NLO $\overline{\rm MS}$-scheme quark coefficient functions for
$F_2$. The numbers denote the value of $Q^2$ in GeV$^2$. We
have chosen $\mu = Q$.



as follows. We define valence PDFs in the resummed
scheme by demanding that their contributions to $F_2$
match those of the corresponding NLO valence PDFs
at a fixed $Q = Q_0$, which is ensured by

$$ \tilde\phi^{\rm res}(N,Q_0^2) = \tilde\phi^{\rm NLO}(N,Q_0^2)\; \frac{\tilde C_2^{\rm NLO}(N,Q_0,Q_0)}{\tilde C_2^{\rm res}(N,Q_0,Q_0)} . \qquad (43) $$



Using the resummed parton densities from Eq. (43),
we can generate the ratios $F_2^{\rm res}(x,Q)/F_2^{\rm NLO}(x,Q)$.

The result of this test, picking $Q_0^2 = 100$ GeV$^2$, is
shown in Fig. 25 for the valence $F_2(x,Q)$ of the proton,
with x = 0.55, 0.65, 0.75 and 0.85. The NLO
distributions were those of [ |2) , and the inversion of
moments was performed as in [ 0. The effect of resummation
is moderate for most Q. At small values
of Q, and large x, the resummed structure function
shows a rather sharp upturn. One also finds a gentle
decrease toward very large Q [12]. We could interpret
this difference as the uncertainty in the purely NLO
valence PDFs implied by resummation.

From this simplified example, we can already see
that the use of resummed coefficient functions is not
likely to make drastic differences in global fits to PDFs
based on DIS data, at least so long as the region of
small $Q^2$, of 10 GeV$^2$ or below, is avoided at very large
x. At the same time, it is clear that a resummed fit
will make some difference at larger x, where PDFs are
not so well known. We stress that a full global fit will
be necessary for complete confidence.

Figure 25. Ratio of the valence parts of the resummed
and NLO proton structure function $F_2(x,Q^2)$, as a
function of $Q^2$ for various values of Bjorken-x. For
$F_2^{\rm res}$, the 'resummed' parton densities have been determined
through Eq. (43).



4. Resummed hadronic scattering 

Processes other than DIS play an important role
in global fits, and in any case are of paramount phenomenological
interest. Potential sources of large corrections
can be identified quite readily in Eq. (34).
At higher orders, factors such as $\alpha_s \ln^2 N$ can be as
large as unity over the physically relevant range of z in
some processes. In this case, they, and their scale dependence,
can be competitive with NLO contributions.
Since they make up well-defined parts of the correction
at each higher order, however, it is possible to resum
them. To better determine PDFs in regions of phase
space where such corrections are important, we may
incorporate resummation in the hard-scattering functions
that determine PDFs.

The Drell-Yan cross section is the benchmark for the 
resummation of logs of 1 — z, or equivalently, logarithms 
of the moment variable N [ [|] , 



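$$\hat\sigma^{\rm DY}_{q\bar q}(N,Q,\mu) = \sigma_{\rm Born}(Q)\, \exp\left[E_{\rm DY}(N,Q,\mu)\right] + O(1/N) . \qquad (44)$$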



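The exponent is given in the $\overline{\rm MS}$ scheme by

$$E_{\rm DY}(N,Q,\mu) = 2\int_{Q^2/N^2}^{\mu^2} \frac{d\mu'^2}{\mu'^2}\, A(\alpha_s(\mu'^2)) \ln N + 2\int_{Q^2/N^2}^{Q^2} \frac{d\mu'^2}{\mu'^2}\, A(\alpha_s(\mu'^2)) \ln\left(\frac{\mu'}{Q}\right) , \qquad (45)$$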

with $A$ as in Eq. (42), and where we have exhibited
the dependence on the factorization scale, setting the
renormalization scale to $Q$. Just as in Eq. (42) for
DIS, Eq. (45) resums all leading and next-to-leading
logarithms of $N$.

It has been noted in several phenomenological appli- 
cations that threshold resummation, and even fixed- 
order expansions based upon it, significantly reduce 
sensitivity to the factorization scale [13]. To see why,
we rewrite the moments of the Drell-Yan cross section 
in resummed form as 

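$$\sigma^{\rm DY}(N,Q) = \sum_q \phi_{q/A}(N,\mu)\, \hat\sigma^{\rm DY}_{q\bar q}(N,Q,\mu)\, \phi_{\bar q/B}(N,\mu) = \sum_q \left[\phi_{q/A}(N,\mu)\, e^{E_{\rm DY}(N,Q,\mu)/2}\right] \sigma_{\rm Born}(Q) \left[\phi_{\bar q/B}(N,\mu)\, e^{E_{\rm DY}(N,Q,\mu)/2}\right] + O(1/N) . \qquad (46)$$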



The exponentials compensate for the $\ln N$ part of
the evolution of the parton distributions, and the $\mu$-dependence
of the resummed expression is suppressed
by a power of the moment variable,

$$\mu \frac{d}{d\mu}\left[\phi_{q/A}(N,\mu)\, e^{E_{\rm DY}(N,Q,\mu)/2}\right] = O(1/N) . \qquad (47)$$

This surprising relation holds because the function
$A(\alpha_s)$ in Eq. (45) equals the residue of the $1/(1-x)$
term in the splitting function $P_{qq}$. Thus, the remaining
$N$-dependence in a resummed cross section still begins
at order $\alpha_s^2$, but the part associated with the $1/(1-x)$
term in the splitting functions has been canceled to
all orders. Of course, the importance of the remaining
sensitivity to $\mu$ depends on the kinematics and the
process. In addition, although resummed cross sections
can be made independent of $\mu$ for all $\ln N$, they
are still uncertain at next-to-next-to-leading logarithm
in $N$, simply because we do not know the function $A$ at
three loops. Notice that none of these results depends
on using PDFs from a resummed scheme, because $\overline{\rm MS}$
PDFs, whether resummed or NLO, evolve the same 
way. The remaining, uncanceled dependence on the 
scales leaves room for an educated use of scale-setting 
arguments [ Q . The connection between resummation 
and the elimination of scale dependence has also been 
emphasized in [15].

Scale dependence aside, can we in good conscience 
combine resummed hard scattering functions in Eq. 
(H) with PDFs from an NLO scheme? This wouldn't 
make much sense if resummation significantly changed 
the coefficient functions with which the PDFs were 
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originally fit. As Fig. ^5] shows, however, this is un- 
likely to be the case for DIS at moderate x. Thus, 
it makes sense to apply threshold resummation with 
NLO PDFs to processes and regions of phase space 
where there is reason to believe that logs are more im- 
portant at higher orders than for the input data to the 
NLO fits. 

At the same time, a set of fits that includes thresh- 
old resummation in their hard-scattering functions can 
be made [ |T(J , and their comparison to strict NLO fits 
would be quite interesting. Indeed, such a compari- 
son would be a new measure of the influence of higher 
orders. A particularly interesting example might be 
to compare resummed and NLO fits using high-$p_T$ jet
data.

5. Power-suppressed corrections 

In addition to higher orders in $\alpha_s(\mu^2)$, Eq. (33) has
corrections that fall off as powers of the hard-scattering
scale $Q$. In contrast to higher orders, these corrections
require a generalization of the form of the factorized
cross section. Often power corrections are parameterized
as $h(x)/[(1-x)Q^2]$ in inclusive DIS, where they
begin at twist four. In DIS, this higher twist term influ- 
ences PDFs when included in joint fits with the NLO 
and NNLO models, and vice-versa [|H|l7|, g||. As 
in the case with higher orders, such "power-improved" 
fits should be treated as new schemes. 

6. Conclusions 

The success of NLO fits to DIS and the studies of 
resummation above suggest that over most of the range 
of x, theoretical uncertainties of the NLO model are 
not severe. At the same time, to fit large x with more 
confidence than is now possible may require including 
the resummed coefficient functions. 

Resummation is especially desirable for global fits
that employ a variety of processes, such as DIS and
high-$p_T$ jet production, which differ in available phase
space near partonic threshold. In a strictly NLO approach,
uncalculated large corrections are automatically
incorporated in the PDFs themselves. As a result,
the NLO model cannot be expected to fit simultaneously
the large-$x$ regions of processes with differing
logs of $1-x$ in their hard-scattering functions, unless
these higher-order corrections are taken into account.

The results illustrated in the figures suggest that
these considerations may be important in DIS with
$Q^2$ below a few GeV$^2$ and at large $x$, where they may
have substantial effects on estimates of higher twist in
DIS. In hadronic scattering, large-$N$ ($x \to 1$) resummation,
which automatically reduces scale dependence,
may play an even more important role than in DIS.
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EXPERIMENTAL DATA AND THEIR 
INTERPRETATION 
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1. Introduction 

The last few years have seen both new and improved 
measurements of deep inelastic and related hard scat- 
tering processes and invigorated efforts to test the lim- 
its of our knowledge of parton distributions (PDF) and 
assess their uncertainty. Recent global analysis fits 
to the wealth of structure functions and related data 
provide PDFs of substantial sophistication compared 
to the previous parametrizations [ [ H ■ The new 
PDF sets account for the correlated uncertainty in the strong
coupling constant, variation from the normalization uncertainty
of data sets, theoretical assumptions regarding
higher-twist effects, the initial parametrization form and
starting $Q_0$ value, etc. Ranges of potential variation
in the gluon density, strange and charm quark densities
or, recently, also in d quark distribution [ [| are also 
provided. Participants in the PDF group of this Workshop
concentrated primarily on finding new ways to include
the systematic uncertainties associated with
experimental data in the framework of global analyses.
Developments in likelihood calculations by Giele,
Keller, and Kosower, studies by CDF and DØ collaborators,
and parallel work by the CTEQ collaboration are
presented in these proceedings.

New or improved results from several experiments
have contributed to better knowledge of PDFs; however,
there are still areas where the interpretation of
experimental data is not clear. A few of these contentious
issues are discussed in this note.

2. Issues in the Interpretation of Experimental 
Data 

2.1. Gluon distribution at moderate to high x 

In principle, many processes are sensitive to the
gluon distribution, but its measurement is difficult
for $x > 0.2$, where it becomes very small. The Fermilab
second-generation direct photon experiment E706, although
quite challenging experimentally, was designed
to constrain the gluon distribution at high $x$. For proton-nucleon
interactions at LO, direct photons are produced
through Compton scattering off a gluon ($gq \to \gamma q$)
90% of the time in the E706 kinematic range.

The first direct photon measurements, as well as 
WA70 [ U were in agreement with the NLO theory 
and were used in several generations of global analy- 
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sis fits. However, series of revisits of theoretical issues 
in the 1990s (see, e.g., the discussion in [ ||) pointed to
a large dependence of the NLO calculation on the renormalization
and factorization scales and to the necessity of including
the yet-unknown photon fragmentation function in
the calculations. Since the available $\sqrt{s}$ is low
(20-40 GeV) for the fixed-target experiments, missing
perturbative orders in the calculation are important.
Moreover, as shown by the E706 analysis, the transverse
momentum of initial-state partons ($k_T$) dramatically
affects the differential cross sections measured
versus the transverse momentum of the outgoing photon
($p_T$). E706 measured the so-called $k_T$ smearing by observing
the kinematic imbalance in the production of $\pi^0$ pairs,
$\pi^0\gamma$, and double direct photons, and found values of about
1 GeV, increasing with $\sqrt{s}$ [ ||. Similar results
are obtained in dijet and Drell-Yan data. The $k_T$ is believed
to arise from both soft gluon emission and nonperturbative
phenomena. NLO calculations smeared
with the $k_T$ estimated from these measurements are increased
by a factor of 2 to 4 (see Figure 26) and agree
with the E706 direct photon and $\pi^0$ data on proton and
Be targets at $\sqrt{s}$ of 31 and 38 GeV. A strong indication
of $k_T$ effects and of the need for soft gluon resummation
also comes from the analysis of double direct
photon production. Both the NLO resummed theory
and the $k_T$-smeared NLO theory describe the double direct
photon kinematics and cross section very well, in stark
contrast to the "plain" NLO prediction [ |t| .
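Why smearing raises a steeply falling cross section can be seen in a toy model. The Python sketch below is not the E706 procedure: it assumes a one-dimensional exponential spectrum $\exp(-b\,p_T)$ and a Gaussian smearing of width $\sigma$, for which the enhancement has the closed form $\exp(b^2\sigma^2/2)$. The slope and width values are hypothetical.

```python
import math


def gaussian(k, sigma):
    """Normalized 1D Gaussian of width sigma."""
    return math.exp(-0.5 * (k / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def smeared_spectrum(pt, slope, sigma, n_steps=4000, n_sigma=8.0):
    """Convolve exp(-slope * pt) with a Gaussian kick k (trapezoid rule)."""
    lo, hi = -n_sigma * sigma, n_sigma * sigma
    h = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = lo + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        # A photon observed at pt came from an unsmeared value pt - k.
        total += weight * math.exp(-slope * (pt - k)) * gaussian(k, sigma)
    return total * h


def enhancement(pt, slope, sigma):
    """Ratio of smeared to unsmeared spectrum at a given pt."""
    return smeared_spectrum(pt, slope, sigma) / math.exp(-slope * pt)


slope = 1.5   # GeV^-1, hypothetical spectrum slope
sigma = 0.8   # GeV, hypothetical smearing width
numeric = enhancement(5.0, slope, sigma)
analytic = math.exp(0.5 * (slope * sigma) ** 2)
```

Because more events migrate up from low $p_T$ than down from high $p_T$, the enhancement grows with both the steepness of the spectrum and the smearing width. A realistic treatment would use a two-dimensional $k_T$ and the actual NLO spectrum.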

A comparison
of current gluon distribution parametrizations
indicates our lack of knowledge of the gluon in the moderate
to high $x$ range (see Figure 27). The hardest
gluon is the CTEQ4HJ distribution. Here the gluon
distribution is forced to follow the high-$E_T$ inclusive
differential jet cross section measured at CDF. The latest
PDF sets by CTEQ match the WA70 direct photon
data at $\sqrt{s} = 23$ GeV with no $k_T$, and require $k_T = 1.1$
(1.3) GeV/c for the E706 data at $\sqrt{s} = 31$ (38) GeV.
Because these requirements are difficult to reconcile, no direct
photon data are used in the CTEQ5 global analysis.
The MRST group chose a different treatment: gluon
distributions are reduced at high $x$ to accommodate
some $k_T$ smearing for both WA70 and E706, resulting
in a moderately good description of the data and three
PDF sets spanning the extremes (shown in Figure 27).
The various predictions agree at low $x$, but differ
widely at high $x$. The uncertainty in the $k_T$ modeling,
its unknown shape versus $p_T$, and a potential discrepancy
between the WA70 and E706 measurements (see the
discussions in [ ^| and [ |§| ) require theoretical work to
help resolve this outstanding controversy. Luckily, the
interest in direct photon physics and its importance 
for gluon determination has caught on, and 98 and 
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Figure 26. Data and Theory agree after kr smear- 
ing for 7T° and 7 production in pBe interactions at 800 
GeV. Data-Theory/Theory comparison for various val- 
ues of kr is shown in lower insert. 



99 have seen a flurry of publications, notably: "Soft-gluon
resummation and NNLO corrections for direct
photon production" by N. Kidonakis and J. Owens (hep-ph/9912388),
"Results in next-to-leading-log prompt-photon
hadroproduction" by S. Catani, M. Mangano, and
C. Oleari (hep-ph/9912206), "Unintegrated parton distributions
and prompt photon hadroproduction" by
M. Kimber, A. Martin, and M. Ryskin (DTP/99/100), "Origin
of $k_T$ smearing in direct photon production" by
H. Lai and H. Li (hep-ph/9802414), "Sudakov resummation
for prompt photon production in hadron collisions"
by S. Catani, M. Mangano, and P. Nason (hep-ph/9806484),
etc. New resummation results are also expected from
the group of G. Sterman and Vogelsang.

In addition to direct photons, the Tevatron jet and 
dijet measurements are also sensitive to the gluon dis- 
tribution (in the moderate x region). These measure- 
ments and comparisons to theory have their own set of 
concerns, e.g. jet definition, which is never exactly the 
same in the data and in the NLO calculation or higher 
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Figure 27. Recent PDF sets indicate substantial dis- 
agreement about the shape and size of gluon distri- 
bution at moderate to high x. CTEQ5 results closely 
follow CTEQ4M curve shown here. 



order correlations in the underlying event (see the discussion
in, e.g., [ ]). The jet cross sections, strongly dominated
by qq scattering, are also sensitive to changes in the
high-$x$ valence distributions. An unresolved issue in the
jet cross section analysis is the lack of full scaling between
the 630 and 1800 GeV data, predicted by QCD, and
a discrepancy between the DØ and CDF measurements
of this scaling ratio at the lowest $x_T = E_T/\sqrt{s}$.

2.2. Valence distributions at high x 

Apart from modifications to the gluon and charm quark
distributions, the valence d quark has received the
biggest boost in the high-$x$ region compared to previous
PDF sets. The change is on the order of 30% at $x = 0.6$
and $Q^2 = 20$ GeV$^2$ and comes from the inclusion of a new
observable in the global analysis fits, namely the W-lepton
asymmetry measured at CDF. A precise measurement of the
W-lepton asymmetry serves as an independent check
on the u and d quark distributions obtained from fits



to deep inelastic data. The observable is directly cor- 
related with the slope of the d/u ratio in the x range 
of 0.1-0.3. The consequence of this new constraint is 
that the predicted F% '/F% ratio is increased and the 
description of the NMC measurement of F% jF% is im- 
proved relative to earlier PDF sets. There remain, 
however, two areas of uncertainty regarding valence
distributions at high $x$: the value of the d/u ratio at $x = 1$,
and the question of whether nuclear corrections are needed
for the NMC $F_2^d/F_2^p$ measurement. Deuterium is a loosely
bound nucleus of low A, and traditionally no corrections
for nuclear effects have been applied. However,
an analysis of SLAC $F_2$ data on different targets, under
the assumption that nuclear effects scale with the
nuclear binding for all nuclei, predicts a nuclear correction
for deuterium of 4±1% at $x = 0.7$ [10]. There is
also a lack of clarity regarding the value of d/u at $x = 1$.
Nonperturbative QCD-motivated models of the 1970s argue
that the d/u ratio should approach 0.2 at the highest
$x$, whereas any standard form of the parametrization
used in global fits drives this ratio to zero. The CTEQ
collaboration has studied the change in the d/u
ratio under different assumptions regarding nuclear effects
in deuterium and the value of the d/u ratio at $x = 1$ [
||. The CTEQ5UD PDF set includes nuclear corrections
for deuterium in $F_2^d/F_2^p$; its change relative to CTEQ5
gives a plausible range for the d distribution uncertainty in
light of this unresolved question (see Figure 28).

2.3. Resolved discrepancies between PDF fits 
and the data 

During this Workshop (March-November 1999), two of
the outstanding discrepancies, either between PDF fits
and data or between two sets of experimental data, were
resolved.

One of these was the nearly 20% discrepancy at small
$x$ (0.007-0.1) between the structure function $F_2$ measured
in muon (NMC) and neutrino (CCFR) deep inelastic
scattering [11]. For the purpose of comparing these
structure functions, the NMC $F_2$ was "corrected" for nuclear
shadowing, measured in muon scattering, to correspond
to an iron target, and rescaled by the 5/18 charge rule
to convert from muon to neutrino $F_2$. On the other
hand, the CCFR result was obtained in the framework of a
massless charm quark, to avoid kinematic differences
between muon and neutrino scattering off the strange
quark ($\nu s \to \mu c$ versus $\mu s \to \mu s$) resulting from the mass
of the charm quark. Any one of the above procedures could
have had an unquantified systematic uncertainty resulting
in the observed disagreement.
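The 5/18 charge rule, and its correction from strange and charm quarks, can be made concrete with a short Python sketch. It assumes leading-order parton-model expressions for an isoscalar target, massless quarks, and an average over neutrino and antineutrino scattering; the function name and the input densities are hypothetical.

```python
def f2_muon_over_neutrino(light, strange, charm):
    """LO ratio F2(muon) / F2(neutrino) per nucleon on an isoscalar target.

    light   = x (u + ubar + d + dbar)
    strange = x (s + sbar)
    charm   = x (c + cbar)
    """
    # Charged leptons couple through squared quark charges; averaging u and d
    # over proton and neutron gives (4/9 + 1/9) / 2 = 5/18 for light quarks.
    f2_mu = (5.0 / 18.0) * light + (1.0 / 9.0) * strange + (4.0 / 9.0) * charm
    # Neutrino plus antineutrino average couples to all flavors equally.
    f2_nu = light + strange + charm
    return f2_mu / f2_nu


# With no strange or charm sea the ratio is exactly 5/18.
pure_light = f2_muon_over_neutrino(1.0, 0.0, 0.0)

# Hypothetical sea content: the ratio becomes
# (5/18) * [1 - (3/5) * (strange - charm) / (light + strange + charm)].
with_sea = f2_muon_over_neutrino(1.0, 0.10, 0.02)
expected = (5.0 / 18.0) * (1.0 - 0.6 * (0.10 - 0.02) / 1.12)
```

The correction term shows why the comparison depends on how the strange and charm seas, and hence the charm mass, are treated.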

A new analysis from CCFR, presented at this Workshop [12],
indicates that the structure function measured by CCFR is
in agreement with the $F_2$ of NMC, within experimental
uncertainties. The analysis used a new measurement
of the difference between the neutrino and antineutrino
structure functions $xF_3$, rather than the $\Delta xF_3 = 4x(s-c)$
parametrization used earlier. A comparison between
calculations [13] indicated that there were large theoretical
uncertainties in the charm production modeling
for both $\Delta xF_3$ and the "slow rescaling" correction that
converts from the massive to the massless charm quark framework.
Therefore, in the new analysis the "slow rescaling"
correction was not applied, and $\Delta xF_3$ and $F_2$ were
extracted from two-parameter fits to the data. The
new measurement agrees well with the Mixed Flavor
Scheme (MFS) for heavy quark production as implemented
by the MRST group. To compare with charged
lepton scattering data, each of the experimental results
was divided by the theoretical predictions for $F_2$, using
either the light or heavy quark schemes implemented
by MRST. The ratios of Data/Theory for $F_2^\nu$ (CCFR),
$F_2^\mu$ (NMC), and $F_2^e$ (SLAC) are shown in Figure 29.
Systematic errors, except for the overall normalization
uncertainties, are included. The MFS MRST predictions
have higher-twist and target-mass corrections applied.
Apart from resolving the NMC-CCFR discrepancy,
the new measurement also has the implication of ruling
out one of the Variable Flavor Scheme calculations
available on the market [14].

Figure 28. The d/u ratio for the CTEQ5 and CTEQ5UD
PDF sets, illustrating the difference due to the nuclear correction
for NMC $F_2$ on deuterium. The dotted and dashed
lines correspond to two different assumptions regarding
the value of d/u at $x = 1$.

Figure 29. The ratio of the massive $F_2^\nu$ measured at
CCFR to the MFS MRST prediction with
target-mass and higher-twist corrections applied. Also
shown are the ratios of $F_2^\mu$ (NMC) and $F_2^e$ (SLAC) to
the MFS MRST predictions.



Another example is that of Drell-Yan production
($pd \to \mu^+\mu^-$) as measured by Fermilab experiment
E772, shown in Figure 30. The MRST fits are compared
to the differential cross section in $x_F = x_1 - x_2$
and in $\sqrt{\tau} = \sqrt{M^2/s}$, where $x_1$ and $x_2$ are the projectile
and target fractional momenta, and $M$ is the dimuon
pair mass. The discrepancy, visible at high $x_F$ and low
$\sqrt{\tau}$, was hard to reconcile, since in this kinematic
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region the dominant contribution to the cross section 
comes from $u(x_1) \times [\bar u(x_2) + \bar d(x_2)]$ evaluated at $x_1 \approx x_F$
and $x_2 \approx 0.03$, well constrained by the deep inelastic
scattering data. Since then, the E772 experiment has
reexamined its acceptance corrections and released
an erratum to its earlier measurement [15]. The
new values differ from the old ones only for large $x_F$
and small values of the mass $M$, and the new cross section
is decreased in this region by a factor of up to two.
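The kinematic statement that large $x_F$ and small mass probe a small second momentum fraction can be checked with a short Python sketch. It assumes leading-order kinematics, $x_F = x_1 - x_2$ and $x_1 x_2 = \tau = M^2/s$, with $x_1$ the larger fraction. The function name is hypothetical, and the value of $\sqrt{s}$ is an assumption corresponding to an 800 GeV proton beam on a fixed target.

```python
import math


def drell_yan_fractions(x_f, mass, sqrt_s):
    """Solve x1 - x2 = x_f and x1 * x2 = M^2 / s for (x1, x2)."""
    tau = (mass / sqrt_s) ** 2
    # x1 is the positive root of x1^2 - x_f * x1 - tau = 0.
    x1 = 0.5 * (x_f + math.sqrt(x_f * x_f + 4.0 * tau))
    x2 = x1 - x_f
    return x1, x2


sqrt_s = 38.8   # GeV, assumed fixed-target center-of-mass energy
mass = 5.0      # GeV, hypothetical dimuon mass
x_f = 0.6       # hypothetical Feynman x

x1, x2 = drell_yan_fractions(x_f, mass, sqrt_s)
tau = (mass / sqrt_s) ** 2
```

For large $x_F$ the solution approaches $x_1 \approx x_F$ and $x_2 \approx \tau / x_F$, so the second fraction falls in the small-$x$ region that deep inelastic data already constrain.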
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Figure 30. Drcll-Yan production from E772 compared 
to the MRST prediction. The theory curves include a K
factor of 0.96, and the cross sections for different values
of $x_F$ are offset by a factor of 10. The corrected E772 data
reduce the discrepancy at high $x_F$ and low $\tau$.



3. Outlook for New Structure Function Mea- 
surements 

Measurements of neutral and charged current cross
sections in positron-proton collisions at large $Q^2$ from
the 1994-97 data have just been published by the HERA
experiments [16] [17]. The data sample corresponds
to an integrated luminosity of 35 pb$^{-1}$. The $Q^2$ evolution
of the parton densities of the proton is tested
over 150-30000 GeV$^2$, for Bjorken $x$ between 0.0032 and 0.65,
and shows no significant deviation from the prediction
of perturbative QCD. These data samples are not yet
sensitive enough to pin down the d quark distribution
at high $x$; however, an expected 1000 pb$^{-1}$ in positron



and electron running in the years 2001-2005, achievable after
the HERA luminosity upgrade, will have a lot to say
about 20%-level effects at high $x$ in the ratio of valence
distributions^} 

HERA's 1995-1999 data sets, not yet included in the
global analysis fits, were plotted against standard PDFs
and showed good agreement over the new kinematic
range that these data span (an extension to yet lower $x$
and higher $Q^2$ compared to the 1994 data) [§. HERA's
very large statistics and improved precision will allow
a further reduction of the normalization uncertainty of PDF
fits. This is important for QCD predictions such as the W and
Z total cross sections at the Tevatron: the current 3% normalization
uncertainty in PDFs translates directly into a
3% uncertainty for these cross sections. Improvements
in the measurements may need to go hand in hand with
progress in the perturbative calculations; it is likely
that an NNLO analysis of deep inelastic scattering data
will change the level and/or $x$ dependence of PDFs at
the percent level.

One can expect continued progress in heavy quark 
treatment and in the theoretical understanding of soft 
gluon and non-perturbative effects in the direct photon 
production. In that case, the E706 data are sufficiently 
precise to severely constrain the gluon distribution. 

One of the few currently active structure-function-related
experiments is NuTeV (Fermilab E815).
A better understanding of charm quark issues (see the discussion
in the preceding section) and the much improved calibration
of the NuTeV detector relative to CCFR's (with
a similar statistical power of the data set) are expected
to yield a more precise measurement of structure functions
and differential cross sections for $\nu$ and $\bar\nu$ interactions
in Fe. A sign-selected beam and several advancements
in the NLO theory of heavy quark production
will allow NuTeV to improve the systematic uncertainty in
the new measurement of the strange seas $s$ and $\bar s$.

Last but not least, Run II physics promises to be a
good source of new constraints on parton distributions.
The W-lepton asymmetry will be measured with much improved
precision and in an expanded rapidity range.
New observables are proposed for further exploring collider
constraints on PDFs, e.g., W and Z rapidity distributions
[ |l^| . And hopefully, many of the issues in
jet measurements will be addressed and understood -
they are high on J. Womersley's Christmas wish-list! [
@ 
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Abstract 

We present a status report of a variety of projects 
related to heavy quark production and parton distri- 
butions for the Tevatron Run II. 

1. Introduction 

The production of heavy quarks, both hadropro- 
duction and leptoproduction, has become an impor- 
tant theoretical and phenomenological issue. While 
the hadroproduction mode is of direct interest to this 
workshop, [ Q we shall find that the simpler leptopro- 
duction process can provide important insights into the 
fundamental production mechanisms. [^, Ij, || There- 
fore, in preparation for the Tevatron Run II, we must 
consider information from a variety of sources includ- 
ing charm and bottom production at fixed-target and 
collider lepton and hadron facilities. 

For example, the charm contribution to the total 
structure function F2 at HERA, is sizeable, up to 
~ 25% in the small x region. [ Therefore a proper 
description of charm-quark production is required for 
a global analysis of structure function data, and hence 
a precise extraction of the parton densities in the pro- 
ton. These elements are important for addressing a 
variety of issues at the Tevatron. 

In addition to the studies investigated at the Run II 
workshop series,^] we want to call attention to the 

16 In particular, in the Run II B-Physics workshop, the stud- 
ies of Working Group 4 : Production, Fragmentation, Spec- 
troscopy, organized by Eric Braaten, Keith Ellis, Eric Lae- 
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extensive work done in the Standard Model Physics 
(and more) at the LHC Workshop organized by 
Guido Altarelli, Daniel Denegri, Daniel Froidevaux, 
Michelangelo Mangano, Tatsuya Nakada which was 
held at CERN during the same period]^] In partic- 
ular, the investigations of the LHC b-production group 
(convenors: Paolo Nason, Giovanni Ridolfi, Olivier 
Schneider, Giuseppe Tartarelli, Vikas Pratibha) and 
the QCD group (convenors: Stefano Catani, Davison 
Soper, W. James Stirling, Stefan Tapprogge, Michael 
Dittmar) are directly relevant to the material discussed 
here. Furthermore, our report limits its scope to the is- 
sues discussed within the Run II workshop; for a recent 
comprehensive review, see Ref. [ 

2. Schemes for Heavy Quark Production 

Heavy quark production also provides an important 
theoretical challenge as the presence of the heavy quark 
mass, M, introduces a new scale into the problem. The 
heavy quark mass scale, M, in addition to the charac- 
teristic energy scale of the process (which we will label 
here generically as E), will require a different orga- 
nization of the perturbation series depending on the 
relative magnitudes of M and E. We find there are 
essentially two cases to consider.^ 

1. For the case of $E \sim M$, heavy-quark production
is calculated in the so-called fixed flavor number
(FFN) scheme from hard processes initiated by
light quarks (u, d, s) and gluons, where all effects
of the charm quark are contained in the perturbative
coefficient functions. The FFN scheme incorporates
the correct threshold behavior, but for
large scales, $E \gg M$, the coefficient functions in
the FFN scheme at higher orders in $\alpha_s$ contain
potentially large logarithms $\ln^n(E^2/M^2)$, which
may need to be resummed. [ 0, ||, [|, [To| 

2. For the case of $E \gg M$, it is necessary to include
the heavy quark as an active parton in the proton.
This serves to resum the potentially large
logarithms $\ln^n(E^2/M^2)$ discussed above. The
simplest approach incorporating this idea is the
so-called zero mass variable flavor number (ZM-VFN)
scheme, where heavy quarks are treated as



nen, William Trischuk, Rick Van Kooten, and Scott Menary, 
addressed many issues of direct interest to this subgroup. 
The report is in progress, and the web page is located at: 
http://www-theory.fnal.gov/people/ligeti/Brun2/
17 The main web page is located at: 
http://home.cern.ch/~mlm/lhc99/lhcworkshop.html
18 We emphasize that the choice of a prescription for dealing with
quark masses in the hard scattering coefficients for deeply inelastic
scattering is a separate issue from the choice of definition of
the parton distribution functions. For all of the prescriptions
discussed here, one uses the standard $\overline{\rm MS}$ definition of parton
distributions.



infinitely massive below some scale $E \sim M$ and
massless above this threshold. This prescription
has been used in global fits for many years, but
it has an error of $O(M^2/E^2)$ and is not suited
for quantitative analyses unless $E \gg M$.
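The two regimes can be compared numerically with a rough Python sketch. It assumes a one-loop running coupling with four flavors and a hypothetical $\Lambda = 0.2$ GeV, and it uses $\alpha_s \ln(E^2/M^2)$ and $M^2/E^2$ only as order-of-magnitude indicators of the unresummed FFN logarithm and of the ZM-VFN mass error. The function names and the charm mass value are illustrative assumptions.

```python
import math


def alpha_s_one_loop(mu, n_f=4, lam=0.2):
    """One-loop running coupling; mu and lam in GeV (lam is an assumed value)."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return 1.0 / (b0 * math.log(mu * mu / (lam * lam)))


def ffn_log_size(energy, mass):
    """Size of the leading unresummed logarithm alpha_s * ln(E^2 / M^2)."""
    return alpha_s_one_loop(energy) * math.log(energy * energy / (mass * mass))


def zm_vfn_mass_error(energy, mass):
    """Indicative size of the neglected M^2 / E^2 terms in the ZM-VFN scheme."""
    return (mass / energy) ** 2


m_charm = 1.5  # GeV, hypothetical charm mass
table = [
    (energy, ffn_log_size(energy, m_charm), zm_vfn_mass_error(energy, m_charm))
    for energy in (1.5, 3.0, 10.0, 100.0)
]
for energy, log_term, mass_term in table:
    print(f"E = {energy:6.1f} GeV  alpha_s*log = {log_term:.3f}  M^2/E^2 = {mass_term:.4f}")
```

Near threshold the logarithm vanishes while the mass term is of order one; at large $E$ the situation is reversed, which is why neither scheme alone covers the full range.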

While the extreme limits $E \gg M$ and $E \sim M$
are straightforward, much of the experimental data
lie in the intermediate region. As such, the correct
PQCD formulation of heavy quark production, capa- 
ble of spanning the full energy range, must incorporate 
the physics of both the FFN scheme and the ZM-VFN 
scheme. Considerable effort has been made to devise a 
prescription for heavy-flavor production that interpo- 
lates between the FFN scheme close to threshold and 
the ZM-VFN scheme at large E. 

The generalized VFN scheme includes the heavy
quark as an active parton flavor and involves matching
between the FFN scheme with three active flavors and
a four-flavor prescription with non-zero heavy-quark
mass. It employs the fact that the mass singularities
associated with the heavy-quark mass can be resummed
into the parton distributions without taking
the limit $M \to 0$ in the short-distance coefficient functions,
as done in the ZM-VFN scheme. This is precisely
the underlying idea of the Aivazis-Collins-Olness-Tung
(ACOT) scheme [ ||, which is based on
the renormalization method of Collins-Wilczek-Zee
(CWZ). [ HI The order-by-order procedure to implement
this approach has now been systematically established
to all orders in PQCD by Collins. [13]

Recently, additional implementations of VFN
schemes have been defined in the literature. While
these schemes all agree in principle on the result
summed to all orders of perturbation theory, the way
of ordering the perturbative expansion is not unique
and the results differ at finite order in perturbation
theory. The Thorne-Roberts (TR) [14] prescription
has been used in the recent MRST global analyses of
parton distributions. [15] The BMSN and CSN prescriptions
have made use of the $O(\alpha_s^2)$ calculations by
Smith, van Neerven, and collaborators [ ||, § to carry
these ideas to higher order. The boundary conditions
on the PDFs at the flavor threshold become more complicated
at this order; in particular, the PDFs are no
longer continuous across the N to N+1 flavor threshold.
Buza et al. [|| have computed the matching conditions,
and this has been implemented in an evolution program
by CSN. [ § More recently, a Simplified-ACOT
(SACOT) scheme inspired by the prescription advocated
by Collins [13] was introduced; [16] we describe
this new scheme in Sec. 4.






3. From Low To High Energy Scale 




Figure 31. $F_2^c$ for $x = 0.01$ as a function of $Q^2$ in GeV$^2$
for two choices of $\mu$, as obtained within the $O(\alpha_s^1)$ FFN
and (ACOT) VFN schemes. For details, see Ref. [0.



To compare the features of the FFN scheme with
the ACOT VFN scheme$^{19}$ concretely, we will take the
example of heavy quark production in DIS; the features
we extract from this example are directly applicable
to the hadroproduction case relevant for the
Tevatron Run II. One way to estimate
the uncertainty of a calculated quantity is to examine
its dependence on the renormalization and factorization
scale. While this method can only provide
a lower bound on the uncertainty, it is a useful tool.

In Fig. 31, we display the component of $F_2^c$ for the
$s + W \to c$ sub-process at $x = 0.01$ plotted vs. $Q^2$.
We gauge the scale uncertainty by varying $\mu$ from
$\mu_0/2$ to $2\mu_0$ with $\mu_0 = \sqrt{Q^2 + m_c^2}$. In this figure,
both schemes are applied to $O(\alpha_s^1)$. We observe that
the scale-variation band of the FFN scheme is narrower at
low $Q$, and widens slightly at larger $Q$. This behavior is
reasonable given that we expect this scheme to work best in
the threshold region, but to decrease in accuracy as the
unresummed logs $\ln^n(Q^2/m_c^2)$ increase.
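The scale-variation procedure itself is simple to sketch in Python. The example below assumes a central scale $\mu_0 = \sqrt{Q^2 + m_c^2}$ varied by factors of 1/2 and 2, and it uses a one-loop $\alpha_s(\mu)$ as a stand-in observable; it is not the $F_2^c$ calculation of either scheme. The function names, the value of $\Lambda$, and the charm mass are hypothetical.

```python
import math


def alpha_s_one_loop(mu, n_f=4, lam=0.2):
    """One-loop running coupling used here as a stand-in observable."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return 1.0 / (b0 * math.log(mu * mu / (lam * lam)))


def scale_band(observable, q2, mass, factors=(0.5, 1.0, 2.0)):
    """Min and max of observable(mu) for mu = factor * sqrt(Q^2 + m^2)."""
    mu0 = math.sqrt(q2 + mass * mass)
    values = [observable(factor * mu0) for factor in factors]
    return min(values), max(values)


def relative_band(observable, q2, mass):
    """Band width divided by the central value at mu = mu0."""
    low, high = scale_band(observable, q2, mass)
    central = observable(math.sqrt(q2 + mass * mass))
    return (high - low) / central


m_charm = 1.5  # GeV, hypothetical charm mass
band_low_q = relative_band(alpha_s_one_loop, 1.0, m_charm)
band_high_q = relative_band(alpha_s_one_loop, 1000.0, m_charm)
```

For this stand-in the band narrows with increasing $Q^2$ simply because the coupling runs more slowly there. In a real calculation the band also reflects how well the explicit $\mu$ dependence of the coefficient functions compensates that of the PDFs.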

Conversely, the ACOT VFN scheme has quite the
opposite behavior. At low $Q$, this calculation displays
mild scale uncertainty, but at large $Q$ this uncertainty
is significantly reduced. This is an indication that the
resummation of the $\ln^n(Q^2/m_c^2)$ terms via the heavy
quark PDF serves to decrease the scale uncertainty at a
given order of perturbation theory. While these general
results were to be expected, what is surprising is the
magnitude of the scale variation. Even in the threshold
region where $Q \sim m_c$ we find that the VFN scheme is
comparable to or better than the FFN scheme.

19 In this section we shall use the ACOT VFN scheme for this illustration.
The conclusions extracted in comparison to the FFN
scheme are largely independent of which VFN scheme is used.



At present, the FFN scheme has been calculated to
one further order in perturbation theory, $O(\alpha_s^2)$. While
the higher-order terms do serve to reduce the scale
uncertainty, it is only at the lowest values of $Q$ that the
$O(\alpha_s^2)$ FFN band is smaller than the $O(\alpha_s^1)$ VFN band.
Recently, $O(\alpha_s^2)$ calculations in the VFN scheme have
been performed; [9] it would be interesting to extend
such comparisons to these new calculations.

Let us also take this opportunity to clarify a mis- 
conception that has occasionally appeared in the liter- 
ature. The VFN scheme is not required to reduce to 
the FFN scheme at $Q = m_c$. While it is true that the
VFN scheme does have the FFN scheme as a limit, this
matching depends on the definitions of the PDF's, and
the choice of the $\mu$ scale.^20 In this particular
example, even at $Q = m_c$, the resummed logs in the
heavy quark PDF can yield a non-zero contribution which
helps to stabilize the scale dependence of the VFN
scheme result.^21

The upshot is that even in the threshold region, the 
resummation of the logarithms via the heavy quark 
PDF's can help the stability of the theory. 

4. Simplified ACOT (SACOT) prescription 

We investigate a modification of the ACOT scheme 
inspired by the prescription advocated by Collins. [ |l3) 
This prescription has the advantage of being easy to 
state, and allowing relatively simple calculations. Such 
simplicity could be crucial for going beyond one loop 
order in calculations.^22

Simplified ACOT (SACOT) prescription. 
Set $M_H$ to zero in the calculation of the
hard scattering partonic functions $\hat{\sigma}$ for
incoming heavy quarks.

For example, this scheme tremendously simplifies 
the calculation of the neutral current structure
function $F_2^{charm}$ even at $\mathcal{O}(\alpha_s^1)$.
In other prescriptions, the tree process
$\gamma + c \to c + g$ and the one loop process
$\gamma + c \to c$ must be computed with non-zero charm mass,
and this results in a complicated expression. [ ^0) In the
SACOT scheme, the charm mass can be set to zero so 
that the final result for these sub-processes reduces to 
the very simple massless result. 

While the SACOT scheme allows us to simplify the 
calculation, the obvious question is: does this
simplified version contain the full dynamics of the
process?

20 The general renormalization scheme is laid out in the CWZ
paper [12]. The matching of the PDF's at $\mathcal{O}(\alpha_s^1)$
was computed in Ref. [18] and Ref. [19]. The
$\mathcal{O}(\alpha_s^2)$ boundary conditions were
computed in Ref. [ ^j] .

21 Cf. Ref. [17] for a detailed discussion.

22 See Ref. Mid] for a detailed definition, discussion, and 
comparisons. 
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Figure 32. F% as a function of Q 2 in GeV computed 
to $\mathcal{O}(\alpha_s^1)$ in the ZM-VFN, FFN, ACOT, and SACOT
schemes using CTEQ4M PDF's. Fig. a) $x = 0.1$, and
Fig. b) $x = 0.001$. Figures taken from Ref. [ pi.



To answer this quantitatively, we compare predictions
for $F_2^{charm}$ obtained with 1) the SACOT scheme at
order $\alpha_s^1$ with 2) the predictions obtained with
the original ACOT scheme, 3) the ZM-VFN procedure in
which the charm quark can appear as a parton but has
zero mass, and 4) the FFN procedure in which the charm
quark has its proper mass but does not appear as a
parton. For simplicity, we take $\mu = Q$.

In Fig. 32 we show $F_2^c(x, Q)$ as a function of $Q$
for $x = 0.1$ and $x = 0.001$ using the CTEQ4M parton
distributions. [|2|, [22) We observe that the ACOT
and SACOT schemes are effectively identical through- 
out the kinematic range. There is a slight difference 
observed in the threshold region, but this is small 
in comparison to the renormalization/factorization
$\mu$-variation (not shown). Hence the difference between
the ACOT and SACOT results is of no physical con- 
sequence. The fact that the ACOT and SACOT 
match extremely well throughout the full kinematic 
range provides explicit numerical verification that the 
SACOT scheme fully contains the physics. 

Although we have used the example of heavy quark 
leptoproduction, let us comment briefly on the impli- 
cations of this scheme for the more complex case of 
hadroproduction. [ [l| |24|, ^5) At present, we have
calculations for all the $\mathcal{O}(\alpha_s^2)$
hadroproduction sub-processes such as $gg \to Q\bar{Q}$
and $gQ \to gQ$. At $\mathcal{O}(\alpha_s^3)$ we have the
result for the $gg \to gQ\bar{Q}$ sub-processes, but not
the general result for $gQ \to ggQ$ with non-zero heavy
quark mass. With the SACOT scheme, we can set the heavy
quark mass to zero in the $gQ \to ggQ$ sub-process and
thus make use of the simple result already in the
literature.^23 This is just one example of how the SACOT
scheme has the practical advantage of allowing us to
extend our calculations to higher orders in
the perturbation theory. We now turn to the case of 
heavy quark production for hadron colliders. 

5. Heavy Quark Hadroproduction 
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Figure 33. Differential cross section for b-production 
vs. $p_T$ comparing the Fixed-Order (FO) and the
Fixed-Order Next-to-Leading-Log (FONLL) result in
the $\overline{\rm MS}$ scheme. The bands are obtained by
varying independently the renormalization and
factorization scales. The cross section is scaled by $m_T$
with $m_T = \sqrt{m_b^2 + p_T^2}$, and $\sqrt{s} = 1800$ GeV,
$m_b = 5$ GeV, $y = 0$, with CTEQ3M PDF's. Figure taken
from Cacciari, Greco, and Nason, Ref. [^J.



There has been notable progress in the area of 
hadroproduction of heavy quarks. The original NLO 
calculations of the $gg \to b\bar{b}$ subprocess were
performed by Nason, Dawson, and Ellis [23], and by
Beenakker, Kuijf, van Neerven, Meng, Schuler, and
Smith [24]. Recently, Cacciari and Greco [ |6) have used
a NLO fragmentation formalism to resum the heavy quark
contributions in the limit of large $p_T$; the result is
a decreased renormalization/factorization scale
variation in the large $p_T$ region. The ACOT scheme was
applied
to the hadroproduction case by Olness, Scalise, and 



23 For a related idea, see the fragmentation function formalism
of Cacciari and Greco [ ] in the following section.
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Tung. [ £5| More recently, the NLO fragmentation for- 
malism of Cacciari and Greco has been merged with
the massive FFN calculation of Nason, Dawson, and
Ellis by Cacciari, Greco, and Nason, [ ^7|; the result
is a calculation which matches the FFN calculation at
low $p_T$, and takes advantage of the NLO fragmentation
formalism in the high $p_T$ region, thus yielding good
behavior throughout the full $p_T$ range. This is
displayed in Fig. 33 where we see that this Fixed-Order
Next-to-Leading-Log (FONLL) calculation displays reduced
scale variation in the large $p_T$ region, and matches
onto the massive NLO calculation in the small $p_T$
region. Further details can be found in the report of
the LHC Workshop b-production group.^24

6. W + Heavy Quark Production 



PDF Set    Mass (GeV)    LO      W+QQ    NLO
CTEQ1M     m_c = 1.7     96      20      161
MRSD0'     m_c = 1.7     81      20      138
CTEQ3M     m_c = 1.7     83      20      141
CTEQ3M     m_b = 5.0     0.17    9.09    9.33

Table 3 

The W + charm-tagged one-jet inclusive cross section
in pb for LO, $W+Q\bar{Q}$, and NLO (including the
$W+Q\bar{Q}$ contribution) using different sets of parton
distribution functions. Table is taken from Ref. f E8| .



The precise measurement of W plus heavy quark 
(W+Q) events provides important information on
a variety of issues. Measurement of W+Q allows us 
to test NLO QCD theory at high scales and investi- 
gate questions about resummation and heavy quark 
PDF's. For example, if sufficient statistics are avail- 
able, W+charm final states can be used to extract 
information about the strange quark distribution. In 
an analogous manner, the W+bottom final states are 
sensitive to the charm PDF; furthermore, W+bottom 
can fake Higgs events^ and are also an important
background for sbottom ($\tilde{b}$) searches.

The cross sections for W plus tagged heavy quark jet 
were computed in Ref. [ ^8| , and are shown in Table 3.
Note that this process has a large K-factor, and hence
comparison between data and theory will provide a
discerning test of the NLO QCD theory. While the small
cross sections of these channels hindered analysis in 



24 The LHC Workshop b-production group is organized by
Paolo Nason, Giovanni Ridolfi, Olivier Schneider, Giuseppe
Tartarelli, Vikas Pratibha, and the report is currently in
preparation. The webpage for the b-production group is located at
http://home.cern.ch/n/nason/www/lhc99/
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Figure 34. Differential da/dpTp for 7 plus tagged heavy 
quark production as compared with Pythia and the 
NLO QCD results. Figure taken from Ref. [||. NLO 
QCD calculations from Ref. [ M. 



Run I, the increased luminosity in Run II can make 
this a discriminating tool. For example, Run I pro- 
vided minimal statistics on W+Q, but there was data 
in the analogous neutral current channel $\gamma+Q$. The
NLO QCD cross sections for $\gamma$ plus heavy quark were
computed in Ref. [30]. Fig. 34 displays preliminary
Tevatron data from Run I and the comparison with
both the PYTHIA Monte Carlo and the NLO QCD
calculations; again, note the large K-factor. If similar
results are attainable in the charged current channel 
at Run II, this would be revealing. 
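
To make the size of the K-factor in Table 3 concrete, the short sketch below recomputes it from the tabulated cross sections. It assumes the common definition K = NLO/LO, which the text does not spell out, and uses the NLO column as given (it includes the W+QQ contribution); the variable and function names are illustrative only.

```python
# Cross sections in pb, transcribed from Table 3: (LO, W+QQbar, NLO).
TABLE3 = [
    ("CTEQ1M", "m_c = 1.7", 96.0, 20.0, 161.0),
    ("MRSD0'", "m_c = 1.7", 81.0, 20.0, 138.0),
    ("CTEQ3M", "m_c = 1.7", 83.0, 20.0, 141.0),
    ("CTEQ3M", "m_b = 5.0", 0.17, 9.09, 9.33),
]

def k_factor(lo, nlo):
    """Ratio NLO/LO; this definition of the K-factor is assumed here."""
    return nlo / lo

for pdf, mass, lo, wqq, nlo in TABLE3:
    print(f"{pdf:8s} {mass:10s} K = {k_factor(lo, nlo):.2f}")
```

With this definition the three charm rows give K close to 1.7, while the bottom row is dominated by the W+QQ contribution, so its ratio to the small LO value is far larger.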

Extensive analyses of the W+Q production channels
were performed in Working Group I: "QCD tools for
heavy flavors and new physics searches," and we can
make use of these results to estimate the precision to
which the strange quark distribution can be extracted.
We display Fig. 35 (taken from the WGI report [31])
which shows the distribution in $x$ of the $s$-quarks
which contribute to the $W+c$ process.^25 This figure
indicates that there will be good statistics in an
$x$-range comparable to that investigated by neutrino DIS
experiments; [ ||, ||] hence, comparison with this data
should provide
an important test of the strange quark sea and the 
underlying mechanisms for computing such processes. 



25 For a detailed analysis of this work including selection
criteria, see the report of Working Group I: "QCD Tools For Heavy
Flavors And New Physics Searches," as well as Ref. [31].
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Figure 35. Distribution of Events/0.01 vs. x of the 
$s$-quarks which contribute to the $s + W \to c$ process.
Figure taken from Ref. [31].



7. The Strange Quark Distribution 

A primary uncertainty for W+charm production
discussed above comes from the strange sea PDF, $s(x)$,
which has been the subject of controversy for some time
now. One possibility is that new analysis of present
data will resolve this situation prior to Run II, and
provide precise distributions as an input to the
Tevatron data analysis. The converse would be that this
situation remains unresolved, in which case new data 
from Run II may help to finally solve this puzzle. 

The strange distribution is directly measured by
dimuon production in neutrino-nucleon scattering.^26
The basic sub-process is $\nu N \to \mu^- c X$ with a
subsequent charm decay $c \to \mu^+ X'$.

The strange distribution can also be extracted
indirectly using a combination of charged ($W^\pm$) and
neutral ($\gamma$) current data; however, the systematic
uncertainties involved in this procedure make an accurate
determination difficult. [32] The basic idea is to use
the relation 



$$\frac{F_2^{NC}}{F_2^{CC}} = \frac{5}{18}\left\{ 1 - \frac{3}{5}\,\frac{(s+\bar{s}) - (c+\bar{c}) + \ldots}{q+\bar{q}} \right\} \qquad (48)$$



to extract the strange distribution. This method is 
complicated by a number of issues including the $xF_3$
component which can play a crucial role in the small- 



26 Presently, there are a number of LO analyses, and one NLO
analysis. [
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Figure 36. Variation of x s(x, n) for three choices of //, 
and also with a "SR" (slow-rescaling) type correction: 
$x \to x(1 + m_c^2/Q^2)$.

x region — precisely the region where there has been a 
long-standing discrepancy. 

The structure functions are defined in terms of the 
neutrino-nucleon cross section via: 

$$\frac{d^2\sigma^{\nu,\bar{\nu}}}{dx\,dy} = \frac{G_F^2 M E}{\pi}\left[ F_2\,(1-y) + xF_1\,y^2 \pm xF_3\,y\left(1-\frac{y}{2}\right)\right]$$

It is instructive to recall the simple leading-order
correspondence between the F's and the PDF's:^27



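$$F_2^{(\nu,\bar{\nu})N} = x\{u + \bar{u} + d + \bar{d} + 2s + 2c\}$$
$$xF_3^{(\nu,\bar{\nu})N} = x\{u - \bar{u} + d - \bar{d} \pm 2s \mp 2c\} \qquad (49)$$

Therefore, the combination $\Delta xF_3$:

$$\Delta xF_3 = xF_3^{\nu N} - xF_3^{\bar{\nu} N} = 4x\{s - c\} \qquad (50)$$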



can be used to probe the strange sea distribution, 
and to understand heavy quark (charm) production. 
This information, together with the exclusive dimuon 
events, may provide a more precise determination of 
the strange quark sea. 
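
As a quick numerical check of the leading-order relations, the sketch below builds $xF_3$ for neutrino and antineutrino scattering from a set of parton densities and confirms that the difference reduces to $4x(s - c)$. The input numbers are arbitrary toy values chosen for the example, not taken from any PDF set, and the function names are hypothetical.

```python
def xF3_nu(x, u, ubar, d, dbar, s, c):
    """LO xF3 for neutrino scattering (upper signs: +2s - 2c)."""
    return x * (u - ubar + d - dbar + 2.0 * s - 2.0 * c)

def xF3_nubar(x, u, ubar, d, dbar, s, c):
    """LO xF3 for antineutrino scattering (lower signs: -2s + 2c)."""
    return x * (u - ubar + d - dbar - 2.0 * s + 2.0 * c)

def delta_xF3(x, u, ubar, d, dbar, s, c):
    """Difference xF3(nu) - xF3(nubar); the valence part cancels."""
    args = (x, u, ubar, d, dbar, s, c)
    return xF3_nu(*args) - xF3_nubar(*args)

# Toy densities at a single x (illustrative numbers only).
toy = dict(x=0.1, u=2.0, ubar=0.5, d=1.0, dbar=0.5, s=0.3, c=0.05)
print(delta_xF3(**toy), 4.0 * toy["x"] * (toy["s"] - toy["c"]))
```

The valence combination drops out of the difference, which is why this quantity isolates the strange and charm seas at leading order.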

To gauge the dependence of $\Delta xF_3$ upon various
factors, we first consider $xs(x,\mu)$ in Fig. 36, and then

27 To exhibit the basic structure, the above is taken in the limit of
4 quarks, a symmetric sea, and a vanishing Cabibbo angle. Of
course, the actual analysis takes into account the full structure. [
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Figure 37. AxF 3 /2 vs. Q 2 for three choices of x. 
Calculations provided by S. Kretzer. 

the full NLO $\Delta xF_3$ in Fig. 37; this allows us to see
the connection between $\Delta xF_3$ and $xs(x,\mu)$ beyond
leading order. In Fig. 36 we have plotted the quantity
$xs(x,\mu)$ vs. $Q^2$ for two choices of $x$ in a range
relevant to the dimuon measurements. We use three
choices of the $\mu^2$ scale:
$\{Q^2, Q^2 + m_c^2, P_{T\,max}^2\}$. The choices $Q^2$ and
$Q^2 + m_c^2$ differ only at lower values of $Q^2$; the
choice $P_{T\,max}^2$ is comparable to $Q^2$ and
$Q^2 + m_c^2$ at $x = 0.08$ but lies above for $x = 0.015$.
The fourth curve labeled $Q^2$ + "SR" uses $\mu^2 = Q^2$
with a "slow-rescaling" type of correction which
(crudely) includes mass effects by shifting $x$ to
$x(1 + m_c^2/Q^2)$; note, the result of this correction is
significant at large $x$ and low $Q^2$.
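
The slow-rescaling shift is simple enough to tabulate directly. The sketch below is only illustrative: the charm mass of 1.6 GeV is an assumed value (the text does not quote the one used for the figure), and the function name is hypothetical.

```python
def slow_rescaled_x(x, Q2, m_c=1.6):
    """Shift x -> x * (1 + m_c^2 / Q^2).

    Q2 is in GeV^2 and m_c in GeV; m_c = 1.6 is an assumed value.
    """
    return x * (1.0 + m_c**2 / Q2)

for x in (0.015, 0.08):
    for Q2 in (2.0, 10.0, 100.0):
        shifted = slow_rescaled_x(x, Q2)
        print(f"x = {x:5.3f}, Q2 = {Q2:6.1f} GeV^2 -> x' = {shifted:.4f}")
```

The relative shift $m_c^2/Q^2$ is independent of $x$, so its impact is governed by how steeply the density falls at the shifted point; this is consistent with the remark that the correction matters most at large $x$ and low $Q^2$.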

In Fig. 37 we have plotted the quantity $\Delta xF_3/2$ for
an isoscalar target computed to order $\alpha_s^1$. We
display three calculations for three different $x$-bins
relevant to strange sea measurement. 1) A 3-flavor
calculation using the GRV98 [ || distributions^28 and
$\mu = \sqrt{Q^2 + m_c^2}$. 2) A 3-flavor calculation using the

28 The scale choice $\mu = \sqrt{Q^2 + m_c^2}$ for the 3-flavor GRV
calculation precisely cancels the collinear strange quark mass
logarithm in the coefficient function thereby making the coefficient
function an exact scaling function, i.e. independent of $\mu^2$.



CTEQ4HQ distributions, and $\mu = Q$. 3) A 4-flavor
calculation using the CTEQ4HQ distributions, and
$\mu = Q$.

The two CTEQ curves show the effect of the charm 
distribution, and the GRV curve shows the effect of 
using a different PDF set. Recall that the GRV calcu- 
lation corresponds to a FFN scheme. 

The pair of curves using the CTEQ4HQ distribu- 
tions nicely illustrates how the charm distribution 
$c(x,\mu^2)$ evolves as $\ln(Q^2/m_c^2)$ for increasing
$Q^2$; note, $c(x,\mu^2)$ enters with a negative sign so
that the 4-flavor result is below the 3-flavor curve.
The choice $\mu = Q$ ensures the 3- and 4-flavor
calculations coincide at $\mu = Q = m_c$; while this
choice is useful for instructive purposes, a more
practical choice might be
$\mu \sim \sqrt{Q^2 + m_c^2}$, cf. Sec. § and Ref. [ 0.

For comparison, we also display preliminary data 
from the CCFR analysis. [ [52| While there is much free- 
dom in the theoretical calculation, the difference be- 
tween these calculations and the data at low Q values 
warrants further investigation. 

8. Conclusions and Outlook 

A detailed understanding of heavy quark produc- 
tion and heavy quark PDF's at the Tevatron Run II 
will require analysis of fixed-target and HERA data 
as well as Run I results. Comprehensive analysis of 
the combined data set can provide incisive tests of the 
theoretical methods in an unexplored regime, and en- 
able precise predictions that will facilitate new particle 
searches in a variety of channels. This document serves 
as a progress report, and work on these topics will con- 
tinue in preparation for the Tevatron Run II. 

This work is supported by the U.S. Department 
of Energy, the National Science Foundation, and the 
Lightner-Sams Foundation. 
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PARTON DENSITIES FOR HEAVY 
QUARKS 

J. Smith^29

C.N. Yang Institute for Theoretical Physics, SUNY
at Stony Brook, Stony Brook, NY 11794-3840

Abstract 

We compare parton densities for heavy quarks. 



Reactions with incoming heavy (c,b) quarks are of- 
ten calculated with heavy quark densities just like 
those with incoming light mass (u,d,s) quarks are cal- 
culated with light quark densities. The heavy quark 
densities are derived within the framework of the so- 
called zero-mass variable flavor number scheme (ZM- 
VFNS). In this scheme these quarks are described by
massless densities which are zero below a specific mass
scale $\mu$. The latter depends on $m_c$ or $m_b$. Let us
call this scale the matching point. Below it there are
$n_f$ massless quarks described by $n_f$ massless
densities. Above it there are $n_f + 1$ massless quarks
described by $n_f + 1$ massless densities. The latter
densities are used to calculate processes with a hard
scale $M \gg m_c, m_b$. For example in the production of
single top quarks via the weak process
$q_i + b \to q_j + t$, where $q_i$, $q_j$ are light mass
quarks in the proton/antiproton, one can argue that
$M = m_t$ should be chosen as the large scale and $m_b$
can be neglected. Hence the incoming bottom
quark can be described by a massless bottom quark
density. 

The generation of these densities starts from the so- 
lution of the evolution equations for $n_f$ massless
quarks below the matching point. At and above this
point one solves the evolution equations for $n_f + 1$
massless quarks. However in contrast to the
parameterization of the $x$-dependences of the light
quarks and gluon at the initial starting scale, the $x$
dependence of the heavy quark density at the matching
point is fixed. In perturbative QCD it is defined by
convolutions of the densities for the $n_f$ quarks and
the gluon with specific operator matrix elements
(OME's), which are now known up to
$\mathcal{O}(\alpha_s^2)$ [ ]. These matching conditions
determine both the ZM-VFNS density and the other
light-mass quark and gluon densities at the matching
points. Then the evolution equations determine the
new densities at larger scales. The momentum sum
rule is satisfied for the $n_f + 1$ quark densities together
with the corresponding gluon density. 
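
The flavor counting implied by the matching points can be written down explicitly. The sketch below is a minimal illustration, assuming matching points at $\mu = m_c$ and $\mu = m_b$ with $m_c = 1.4$ GeV and $m_b = 4.5$ GeV (the values whose squares, 1.96 and 20.25 GeV$^2$, are quoted later in this contribution); it only counts flavors and performs no evolution or matching of densities. The function name is hypothetical.

```python
def active_flavors(mu, m_c=1.4, m_b=4.5, n_light=3):
    """Number of massless flavors at scale mu (GeV) in a ZM-VFNS-style counting.

    Assumes matching points at mu = m_c and mu = m_b, with the larger
    flavor number used at and above each matching point.
    """
    n_f = n_light
    for threshold in (m_c, m_b):
        if mu >= threshold:
            n_f += 1
    return n_f

for mu in (1.0, 1.4, 3.0, 4.5, 100.0):
    print(f"mu = {mu:6.1f} GeV -> n_f = {active_flavors(mu)}")
```

The `>=` comparison mirrors the statement that the $n_f + 1$ flavor equations are solved at and above the matching point.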

29 Work supported in part by the NSF grant PHY-9722101 
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Figure 38. The charm quark density xcnnlo(5, x, fi 2 ) 
in the range $10^{-5} < x < 1$ for $\mu^2 = 20.25$, 25, 30, 40
and 100 in units of (GeV/$c^2$)$^2$.
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Figure 39. Same as Fig.l for the NLO results from 
MRST98 set 1. 

Parton density sets contain densities for charm and 
bottom quarks, which generally directly follow this ap- 
proach or some modification of it. The latest CTEQ 
densities [ § use $\mathcal{O}(\alpha_s)$ matching conditions. The $x$
dependencies of the heavy c and b-quark densities are 
zero at the matching points. The MRST densities [ [| 
have more complicated matching conditions designed 
so that the derivatives of the deep inelastic structure 
functions $F_2$ and $F_L$ with regard to $Q^2$ are continu-
ous at the matching points. Recently we have pro- 
vided another set of ZM-VFNS densities [ f|, which 
are based on extending the GRV98 three-flavor den- 
sities in [ k| to four and five-flavor sets. GRV give 
the formulae for their LO and NLO three flavor den- 
sities at very small scales. They never produced a c- 
quark density but advocated that charm quarks should 
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Fig. 3 Fig- 4 
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Figure 40. Same as Fig.l for the NLO results from 
CTEQ5HQ. 



Figure 41. The bottom quark density $x b_{\rm NNLO}(5, x, \mu^2)$
in the range $10^{-5} < x < 1$ for $\mu^2 = 20.25$, 25, 30, 40
and 100 in units of (GeV/$c^2$)$^2$.



only exist in the final state of production reactions, 
which should be calculated from NLO QCD with mas- 
sive quarks as in [ || . We have evolved their LO and 
NLO densities across the matching point $\mu = m_c$ with
$\mathcal{O}(\alpha_s^2)$ matching conditions to provide LO
and NLO four-flavor densities containing massless
c-quark densities. Then these LO and NLO densities were
evolved between $\mu = m_c$ and $\mu = m_b$ with four-flavor LO
and NLO splitting functions. At this new matching 
point the LO and NLO four-flavor densities were then 
convoluted with the $\mathcal{O}(\alpha_s^2)$ OME's to form five-flavor
sets containing massless b-quarks. These LO and NLO 
densities were then evolved to higher scales with five- 
flavor LO and NLO splitting functions. Note that the 
$\mathcal{O}(\alpha_s^2)$ matching conditions should really be used with
NNLO splitting functions to produce NNLO density 
sets. However the latter splitting functions are not yet 
available, so we make the approximation of replacing 
the NNLO splitting functions with NLO ones. 

In this short report we would like to compare the 
charm and bottom quark densities in the CS, MRS and 
CTEQ sets. We concentrate on the five-flavor densi- 
ties, which are more important for Tevatron physics. 
In the CS set they start at $\mu^2 = m_b^2 = 20.25$ GeV$^2$.
At this scale the charm densities in the CS, MRST98
(set 1) and CTEQ5HQ sets are shown in Figs. 1, 2, 3
respectively. Since the CS charm density starts off
negative for small $x$ at $\mu^2 = m_c^2 = 1.96$ GeV$^2$ it
evolves less than the corresponding CTEQ5HQ density. At
larger $\mu^2$ all the CS curves in Fig. 1 are below those
for CTEQ5HQ in Fig. 3 although the differences are
small. In general the CS c-quark densities are closer
to those in the MRST (set 1) in Fig. 2.

At the matching point $\mu^2 = 20.25$ GeV$^2$ the b-quark
density also starts off negative at small x as can be seen 
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Figure 42. Same as Fig. 4 for the NLO results from 
MRST98 set 1. 

in Fig. 4, which is a consequence of the explicit form of 
the OME's in [ Q. At $\mathcal{O}(\alpha_s^2)$ the OME's have
non-logarithmic terms which do not vanish at the
matching point and yield a finite function in $x$, which
is the boundary value for the evolution of the b-quark
density. This negative start slows down the evolution
of the b-quark density at small $x$ as the scale $\mu^2$
increases. Hence the CS densities at small $x$ in Fig. 4
are smaller than the MRST98 (set 1) densities in Fig. 5
and the CTEQ5HQ densities in Fig. 6 at the same values
of $\mu^2$. The differences between the sets are still
small, of the order of five percent at small $x$ and
large $\mu^2$. Hence
it should not really matter which set is used to cal- 
culate cross sections for processes involving incoming 
b-quarks at the Tevatron. 

We suspect that the differences between these re- 
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Figure 43. Same as Fig. 4 for the NLO results from 
CTEQ5HQ. 
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suits for the heavy c and b-quark densities are pri- 
marily due to the different gluon densities in the three
sets rather than to the effects of the different
boundary conditions. This could be checked theoretically
if both LO and NLO three-flavor sets were provided by
MRST and CTEQ at small scales. Then we could rerun our
programs to generate sets with $\mathcal{O}(\alpha_s^2)$
boundary conditions. However these inputs are not available.
We note that CS uses the GRV98 LO and NLO gluon
densities, which are rather steep in $x$ and generally
larger than the latter sets at the same values of $\mu^2$.
Since the discontinuous boundary conditions suppress 
the charm and bottom densities at small x, they en- 
hance the gluon densities in this same region (in order 
that the momentum sum rules are satisfied). Hence 
the GRV98 three flavour gluon densities and the CS 
four and five flavor gluon densities are generally sig- 
nificantly larger than those in MRST98 (set 1) and 
CTEQ5HQ. Unfortunately experimental data are not 
yet precise enough to decide which set is the best one. 
We end by noting that all these densities are given in 
the $\overline{\rm MS}$ scheme.
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Abstract 

The hadroproduction of lepton pairs with mass Q 
and finite transverse momentum $Q_T$ is described in
perturbative QCD by the same partonic subprocesses 
as prompt photon production. We demonstrate that, 
like prompt photon production, lepton pair production 
is dominated by quark-gluon scattering in the region 
$Q_T > Q/2$. This feature leads to sensitivity to the
gluon density in kinematical regimes accessible in col- 
lider and fixed target experiments, and it provides a 
new independent method for constraining the gluon 
density. 

1. Introduction 

The production of lepton pairs in hadron collisions 
$h_1 h_2 \to \gamma^* X$; $\gamma^* \to l\bar{l}$ proceeds
through an intermediate virtual photon via
$q\bar{q} \to \gamma^*$, and the subsequent
leptonic decay of the virtual photon. Traditionally, in- 
terest in this Drell-Yan process has concentrated on 
lepton pairs with large mass Q which justifies the ap- 
plication of perturbative QCD and allows for the ex- 
traction of the antiquark density in hadrons [ ^ . 

Prompt photon production $h_1 h_2 \to \gamma X$ can be
calculated in perturbative QCD if the transverse
momentum $Q_T$ of the photon is sufficiently large.
Because the quark-gluon Compton subprocess is dominant,
$gq \to \gamma X$, this reaction provides essential
information on the gluon density in the proton at large
x [ pj. Unfortunately, the analysis suffers from frag- 
mentation, isolation, and intrinsic transverse momen- 
tum uncertainties. Alternatively, the gluon density can 
be constrained from the production of jets with large 
transverse momentum at hadron colliders [ [| , but the 
information from different experiments and colliders is 
ambiguous. 

30 Supported by the U.S. Department of Energy, Division of High 
Energy Physics, under Contract W-31-109-ENG-38. 
31 Supported by Bundesministerium für Bildung und Forschung
under Contract 05 HT9GUA 3, by Deutsche Forschungsgemein- 
schaft under Contract KL 1266/1-1, and by the European Com- 
mission under Contract ERBFMRXCT980194. 



In this paper we demonstrate that, like prompt pho- 
ton production, lepton pair production is dominated by 
quark-gluon scattering in the region $Q_T > Q/2$. This
realization means that new independent constraints on 
the gluon density may be derived from Drell-Yan data 
in kinematical regimes that are accessible in collider 
and fixed target experiments but without the theo- 
retical and experimental uncertainties present in the 
prompt photon case. 

In Sec. 2 we review the relationship between virtual
and real photon production in hadron collisions in
next-to-leading order QCD. In Sec. 3 we present our
numerical results, and Sec. 4 is a summary.

2. Next-to-leading order QCD formalism

In leading order (LO) QCD, two partonic subpro- 
cesses contribute to the production of virtual and real 
photons with non-zero transverse momentum:
$q\bar{q} \to \gamma^{(*)} g$ and $qg \to \gamma^{(*)} q$.
The cross section for lepton pair production is related
to the cross section for virtual photon production
through the leptonic branching ratio of the virtual
photon $\alpha/(3\pi Q^2)$. The virtual photon cross
section reduces to the real photon cross section in the
limit $Q^2 \to 0$.

The next-to-leading order (NLO) QCD corrections 
arise from virtual one-loop diagrams interfering with 
the LO diagrams and from real emission diagrams. 
At this order $2 \to 3$ partonic processes with incident
gluon pairs $(gg)$, quark pairs $(qq)$, and
non-factorizable quark-antiquark $(q\bar{q}_2)$ processes
contribute also. Singular contributions are regulated
in $n = 4 - 2\epsilon$ dimensions and removed through
$\overline{\rm MS}$ renormalization, factorization, or
cancellation between virtual and real contributions. An
important difference between virtual and real photon
production arises when a quark emits a collinear
photon. Whereas the collinear emission of a real photon
leads to a $1/\epsilon$ singularity that has to be
factored into a fragmentation function, the collinear
emission of a virtual photon yields a finite
logarithmic contribution since it is regulated
naturally by the photon virtuality $Q$. In the limit
$Q^2 \to 0$ the NLO virtual photon cross section reduces
to the real photon cross section if this logarithm is
replaced by a $1/\epsilon$ pole. A
more detailed discussion can be found in [ |[ .

The situation is completely analogous to hard photo- 
production where the photon participates in the scat- 
tering in the initial state instead of the final state. For 
real photons, one encounters an initial-state singular- 
ity that is factored into a photon structure function. 
For virtual photons, this singularity is replaced by a 
logarithmic dependence on the photon virtuality Q [ 

1- 

A remark is in order concerning the interval in Qt 
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in which our analysis is appropriate. In general, in 
two-scale situations, a series of logarithmic contribu- 
tions will arise with terms of the type
$\alpha_s^n \ln^n(Q/Q_T)$. Thus, if either $Q_T \gg Q$ or
$Q_T \ll Q$, resummations of this series must be
considered. For practical reasons, such as event rate,
we do not venture into the domain $Q_T \gg Q$, and our
fixed-order calculation should be adequate. On the
other hand, the cross section is large in the region
$Q_T \ll Q$. In previous papers [ [| , we compared our
cross sections with available fixed-target and collider
data on massive lepton-pair production, and we were
able to establish that fixed-order perturbative
calculations, without resummation, should be reliable
for $Q_T > Q/2$. At smaller values of $Q_T$,
non-perturbative and matching complications introduce
some level of phenomenological ambiguity. For the goal
we have in mind, viz., constraints on the gluon
density, it would appear best to restrict attention to
the region $Q_T > Q/2$, but below $Q_T \gg Q$.

3. Predicted cross sections 

In this section we present numerical results for the
production of lepton pairs in $p\bar{p}$ collisions at
the Tevatron with center-of-mass energy $\sqrt{S} = 1.8$
and 2.0 TeV. We analyze the invariant cross section
$E\,d^3\sigma/dp^3$ averaged over the rapidity interval
$-1.0 < y < 1.0$. We integrate the cross section over
various intervals of pair-mass $Q$ and plot it as a
function of the transverse momentum $Q_T$. Our
predictions are based on a NLO QCD calculation [ ||] and
are evaluated in the $\overline{\rm MS}$ renormalization
scheme. The renormalization and factorization scales
are set to $\mu = \mu_f = \sqrt{Q^2 + Q_T^2}$.
If not stated otherwise, we use the CTEQ4M parton
distributions [ Q and the corresponding value of
$\Lambda$ in the two-loop expression of $\alpha_s$ with
four flavors (five if $\mu > m_b$). The Drell-Yan factor
$\alpha/(3\pi Q^2)$ for the decay of the virtual photon
into a lepton pair is included in all numerical results.

In Fig. 44 we display the NLO QCD cross section for
lepton pair production at the Tevatron at
$\sqrt{S} = 1.8$ TeV as a function of $Q_T$ for four
regions of $Q$. The regions of $Q$ have been chosen to
avoid resonances, i.e. between 2 GeV and the $J/\psi$
resonance, between the $J/\psi$ and the $\Upsilon$
resonances, above the $\Upsilon$'s, and a high mass
region. The cross section falls both with the mass of
the lepton pair $Q$ and, more steeply, with its
transverse momentum $Q_T$. No data are available yet
from the CDF and D0 experiments. However, prompt photon
production data exist to $Q_T \simeq 100$ GeV, where the
cross section is about $10^{-3}$ pb/GeV$^2$. It should be
possible to analyze Run I data for lepton pair
production to at least $Q_T \simeq 30$ GeV where
one can probe the parton densities in the proton up to
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Figure 44. Invariant cross section Ed 3 a /dp 3 as a func- 
tion of $Q_T$ for $p\bar{p} \to \gamma^* X$ at
$\sqrt{S} = 1.8$ TeV in non-resonance regions of $Q$. The
cross section falls with the mass of the lepton pair
$Q$ and, more steeply, with its transverse momentum $Q_T$.



x T = 2Q T /\fS ~ 0.03. The UAl collaboration mea- 
sured the transverse momentum distribution of lepton 
pairs at \/~S — 630 GeV up to xt — 0.13 [13], and their 
data agree well with our theoretical results [ [| . 
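
The scaling variable $x_T = 2Q_T/\sqrt{S}$ used above can be evaluated directly. The sketch below is only an arithmetic aid; the function names are hypothetical and not part of any analysis code referred to in the text.

```python
def x_t(qt_gev, sqrt_s_gev):
    """Scaled transverse momentum x_T = 2 Q_T / sqrt(S)."""
    if sqrt_s_gev <= 0.0:
        raise ValueError("sqrt(S) must be positive")
    return 2.0 * qt_gev / sqrt_s_gev


def qt_reach(xt, sqrt_s_gev):
    """Transverse momentum Q_T (GeV) that corresponds to a given x_T."""
    return 0.5 * xt * sqrt_s_gev


if __name__ == "__main__":
    # Run I reach discussed in the text: Q_T = 30 GeV at sqrt(S) = 1.8 TeV.
    print(f"x_T = {x_t(30.0, 1800.0):.3f}")
    # Q_T corresponding to x_T = 0.13 at sqrt(S) = 630 GeV.
    print(f"Q_T = {qt_reach(0.13, 630.0):.2f} GeV")
```

With these inputs the first value is about 0.033, consistent with the $x_T \simeq 0.03$ quoted in the text.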

The fractional contributions from the $qg$ and $q\bar{q}$ subprocesses through NLO are shown in Fig. 45. It is evident that the $qg$ subprocess is the most important subprocess as long as $Q_T > Q/2$. The dominance of the $qg$ subprocess diminishes somewhat with $Q$, dropping from over 80 % for the lowest values of $Q$ to about 70 % at its maximum for $Q \simeq 30$ GeV. In addition, for very large $Q_T$, the significant luminosity associated with the valence dominated $\bar{q}$ density in $p\bar{p}$ reactions begins to raise the fraction of the cross section attributed to the $q\bar{q}$ subprocesses. Subprocesses other than those initiated by the $q\bar{q}$ and $qg$ initial channels are of negligible import.

Figure 45. Contributions from the partonic subprocesses $qg$ and $q\bar{q}$ to the invariant cross section $E\,d^3\sigma/dp^3$ as a function of $Q_T$ for $p\bar{p} \to \gamma^* X$ at $\sqrt{S} = 1.8$ TeV. The $qg$ channel dominates in the region $Q_T > Q/2$.

We update the Tevatron center-of-mass energy to Run II conditions ($\sqrt{S} = 2.0$ TeV) and use the latest global fit by the CTEQ collaboration (5M). Figure 46 demonstrates that the larger center-of-mass energy increases the invariant cross section for the production of lepton pairs with mass 5 GeV $< Q <$ 6 GeV by 5 % at low $Q_T \simeq 1$ GeV and 20 % at high $Q_T \simeq 100$ GeV. In addition, the expected luminosity for Run II of 2 fb$^{-1}$ should make the cross section accessible to $Q_T = 100$ GeV or $x_T = 0.1$. This extension would constrain the gluon density in the same regions as prompt photon production in Run I.

Figure 46. Invariant cross section $E\,d^3\sigma/dp^3$ as a function of $Q_T$ for $p\bar{p} \to \gamma^* X$ and two different center-of-mass energies of the Tevatron (Run 1: $\sqrt{S} = 1.8$ TeV, Run 2: $\sqrt{S} = 2.0$ TeV). The cross section for Run 2 is 5 to 20 % larger, depending on $Q_T$.

Next we present a study of the sensitivity of collider and fixed target experiments to the gluon density in the proton. The full uncertainty in the gluon density is not known. Here we estimate this uncertainty from the variation of different recent parametrizations. We choose the latest global fit by the CTEQ collaboration (5M) as our point of reference [ ] and compare results to those based on their preceding analysis (4M [ ]) and on a fit with a higher gluon density (5HJ) intended to describe the CDF and D0 jet data at large transverse momentum. We also compare to results based on global fits by MRST [ ], who provide three different sets with a central, higher, and lower gluon density, and to GRV98 [ ]$^1$.



In Fig. 47 we plot the cross section for lepton pairs with mass between the $J/\psi$ and $\Upsilon$ resonances at Run II of the Tevatron in the region between $Q_T = 10$ and 30 GeV ($x_T = 0.01 \ldots 0.03$). For the CTEQ parametrizations we find that the cross section increases from 4M to 5M by 2.5 % ($Q_T = 30$ GeV) to 5 % ($Q_T = 10$ GeV) and from 5M to 5HJ by 1 % in the whole $Q_T$ range. The largest differences from CTEQ5M are obtained with GRV98 at low $Q_T$ (minus 10 %) and with MRST($g\uparrow$) at large $Q_T$ (minus 7 %).
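
The comparison of parametrizations described above amounts to quoting percentage shifts relative to a reference set. A minimal sketch of that bookkeeping follows; the function names are hypothetical, and the numbers in the usage example are invented placeholders in arbitrary units, not values read from Fig. 47.

```python
def percent_shift(value, reference):
    """Relative change of `value` with respect to `reference`, in percent."""
    if reference == 0.0:
        raise ValueError("reference cross section must be non-zero")
    return 100.0 * (value - reference) / reference


def pdf_spread(predictions, reference_set):
    """Shifts of every PDF set relative to the reference set.

    `predictions` maps a PDF-set label to a cross section at one
    kinematic point. Returns (shifts, smallest shift, largest shift).
    """
    ref = predictions[reference_set]
    shifts = {
        name: percent_shift(value, ref)
        for name, value in predictions.items()
        if name != reference_set
    }
    return shifts, min(shifts.values()), max(shifts.values())


if __name__ == "__main__":
    # Placeholder cross sections (arbitrary units), for illustration only.
    toy = {"reference": 2.00, "set A": 1.90, "set B": 2.02, "set C": 1.80}
    shifts, low, high = pdf_spread(toy, "reference")
    for name, shift in sorted(shifts.items()):
        print(f"{name}: {shift:+.1f} %")
    print(f"envelope: {low:+.1f} % to {high:+.1f} %")
```

Such an envelope is only a rough indicator, since, as noted in the text, the full uncertainty in the gluon density is not known.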

1 In this set a purely perturbative generation of heavy flavors (charm and bottom) is assumed. Since we are working in a massless approach, we resort to the GRV92 parametrization for the charm contribution [ ] and assume the bottom contribution to be negligible.

Figure 47. Invariant cross section $E\,d^3\sigma/dp^3$ as a function of $Q_T$ for $p\bar{p} \to \gamma^* X$ at $\sqrt{S} = 2.0$ TeV in the region between the $J/\psi$ and $\Upsilon$ resonances. The largest differences from CTEQ5M are obtained with GRV98 at low $Q_T$ (minus 10 %) and with MRST($g\uparrow$) at large $Q_T$ (minus 7 %).

The theoretical uncertainty in the cross section can be estimated by varying the renormalization and factorization scale $\mu = \mu_f$ around the central value $\sqrt{Q^2 + Q_T^2}$. Figure 48 shows this variation for $p\bar{p} \to \gamma^* X$ at $\sqrt{S} = 2.0$ TeV in the region between the $J/\psi$ and $\Upsilon$ resonances. In the interval $0.5 < \mu/\sqrt{Q^2 + Q_T^2} < 2$ the dependence of the cross section on the scale $\mu = \mu_f$ drops from ±15% (LO) to the small value ±2.5% (NLO). The K-factor ratio (NLO/LO) is approximately 2, as one might expect naively.
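
The scale-variation estimate described above can be sketched as follows. Two assumptions are made here that the text does not spell out: the "±" figure is taken as half the spread between the largest and smallest cross section in the interval, relative to the value at the central scale, and the toy LO and NLO curves are invented shapes, tuned only to mimic the sizes quoted in the text. They are not the curves of Fig. 48.

```python
import math


def scale_band(sigma, lo=0.5, hi=2.0, n=41):
    """Return (central, min, max) of sigma(xi) for xi = mu/mu0 in [lo, hi].

    The interval is sampled on a geometric grid so that both endpoints
    are included and the scan is uniform in log(mu).
    """
    if n < 2 or not 0.0 < lo < hi:
        raise ValueError("need n >= 2 and 0 < lo < hi")
    xis = [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]
    values = [sigma(xi) for xi in xis]
    return sigma(1.0), min(values), max(values)


def half_spread_percent(sigma, lo=0.5, hi=2.0, n=41):
    """Assumed definition of the '+-' scale uncertainty, in percent."""
    central, vmin, vmax = scale_band(sigma, lo, hi, n)
    return 100.0 * (vmax - vmin) / (2.0 * central)


def k_factor(sigma_nlo, sigma_lo, xi=1.0):
    """Ratio NLO/LO at the scale factor xi."""
    return sigma_nlo(xi) / sigma_lo(xi)


# Toy scale dependences (arbitrary units), invented for illustration.
def toy_lo(xi):
    return 1.0 - 0.15 * math.log2(xi)


def toy_nlo(xi):
    return 2.0 - 0.05 * math.log2(xi)


if __name__ == "__main__":
    print(f"LO : +-{half_spread_percent(toy_lo):.1f} %")
    print(f"NLO: +-{half_spread_percent(toy_nlo):.1f} %")
    print(f"K  : {k_factor(toy_nlo, toy_lo):.1f}")
```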

Figure 48. Invariant cross section $E\,d^3\sigma/dp^3$ as a function of the renormalization and factorization scale $\mu = \mu_f$ for $p\bar{p} \to \gamma^* X$ at $\sqrt{S} = 2.0$ TeV in the region between the $J/\psi$ and $\Upsilon$ resonances and $Q_T = 5.5$ GeV. In the interval $0.5 < \mu/\sqrt{Q^2 + Q_T^2} < 2$ the dependence of the cross section on the scale $\mu = \mu_f$ drops from ±15% (LO) to ±2.5% (NLO). The K-factor (NLO/LO) is approximately 2.

A similar analysis for Fermilab's fixed target experiment E772 [ ] is shown in Fig. 49. In this experiment, a deuterium target is bombarded with a proton beam of momentum $p_{\rm lab} = 800$ GeV, i.e. $\sqrt{S} = 38.8$ GeV. The cross section is averaged over the scaled longitudinal momentum interval $0.1 < x_F < 0.3$. In fixed target experiments one probes substantially larger regions of $x_T$ than in collider experiments. Therefore one expects greater sensitivity to the gluon distribution in the proton. We find that use of CTEQ5HJ increases the cross section relative to CTEQ5M by 7 % (26 %) at $Q_T = 3$ GeV ($Q_T = 6$ GeV) and by 134 % at $Q_T = 10$ GeV. With MRST($g\downarrow$) the cross section drops relative to the CTEQ5M-based values by 17 %, 40 %, and 59 % for these three choices of $Q_T$.
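
The center-of-mass energy quoted for the fixed target experiment follows from standard two-body kinematics, $S = m_b^2 + m_t^2 + 2 m_t E_b$ for a beam of energy $E_b$ on a target at rest. The sketch below checks the arithmetic under the assumption that the collision is treated per nucleon, with the proton mass used for both beam and target; the function name is hypothetical.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton mass in GeV


def fixed_target_sqrt_s(p_lab_gev, m_beam=PROTON_MASS_GEV,
                        m_target=PROTON_MASS_GEV):
    """sqrt(S) in GeV for a beam of momentum p_lab on a target at rest."""
    e_beam = math.hypot(p_lab_gev, m_beam)  # E = sqrt(p^2 + m^2)
    s = m_beam ** 2 + m_target ** 2 + 2.0 * m_target * e_beam
    return math.sqrt(s)


if __name__ == "__main__":
    root_s = fixed_target_sqrt_s(800.0)
    print(f"sqrt(S) = {root_s:.1f} GeV")
    # x_T = 2 Q_T / sqrt(S) at the three Q_T values discussed in the text.
    for qt in (3.0, 6.0, 10.0):
        print(f"Q_T = {qt:4.1f} GeV -> x_T = {2.0 * qt / root_s:.2f}")
```

For $p_{\rm lab} = 800$ GeV this gives $\sqrt{S} \simeq 38.8$ GeV, as stated in the text, and $x_T$ values between roughly 0.15 and 0.5, well above the collider range.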

Figure 50 shows the variation of the fixed target cross section with the renormalization and factorization scale $\mu = \mu_f$. In the interval $0.5 < \mu/\sqrt{Q^2 + Q_T^2} < 2$ the dependence decreases from ±49% (LO) to ±37% (NLO). An optimal scale choice might be $\mu = \mu_f = \sqrt{Q^2 + Q_T^2}/[\ldots]$, where the points of Minimal Sensitivity (maximum of NLO) and of Fastest Apparent Convergence (LO=NLO) nearly coincide. At $\mu = \mu_f = \sqrt{Q^2 + Q_T^2}$, the K-factor ratio is 2.6. The NLO cross section turns negative at the lowest scale shown, $\mu = \mu_f = \sqrt{Q^2 + Q_T^2}/8 \simeq 1$ GeV, a value too low to guarantee perturbative stability.
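
The two scale-setting criteria named above can be located numerically on tabulated LO and NLO curves. The following sketch assumes the curves are given on a common grid of scales; the point of Minimal Sensitivity is approximated by the grid maximum of the NLO curve, and the points of Fastest Apparent Convergence by sign changes of LO minus NLO, interpolated linearly in $\log\mu$. The function names and the numbers in the usage example are invented for illustration and are not taken from Fig. 50.

```python
import math


def pms_scale(scales, nlo):
    """Grid scale at which the tabulated NLO cross section is largest."""
    best = max(range(len(nlo)), key=lambda i: nlo[i])
    return scales[best]


def fac_scales(scales, lo, nlo):
    """Scales where LO = NLO, from sign changes of (LO - NLO)."""
    diff = [a - b for a, b in zip(lo, nlo)]
    crossings = []
    for i in range(len(scales) - 1):
        d0, d1 = diff[i], diff[i + 1]
        if d0 == 0.0:
            crossings.append(scales[i])
        elif d0 * d1 < 0.0:
            # Linear interpolation in log(mu) between neighbouring points.
            t = d0 / (d0 - d1)
            log0, log1 = math.log(scales[i]), math.log(scales[i + 1])
            crossings.append(math.exp(log0 + t * (log1 - log0)))
    if diff and diff[-1] == 0.0:
        crossings.append(scales[-1])
    return crossings


if __name__ == "__main__":
    # Invented toy table: scale factor mu/mu0 and cross sections (arb. units).
    scales = [0.125, 0.25, 0.5, 1.0, 2.0]
    lo = [4.0, 3.0, 2.0, 1.0, 0.5]
    nlo = [-1.0, 3.5, 3.0, 2.6, 2.0]
    print("PMS at mu/mu0 =", pms_scale(scales, nlo))
    print("FAC at mu/mu0 =", fac_scales(scales, lo, nlo))
```

A finer grid, or a fit to the curves, would be needed before reading any physical significance into the positions found this way.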

4. Summary 

The production of Drell-Yan pairs with low mass and large transverse momentum is dominated by gluon initiated subprocesses. In contrast to prompt photon production, uncertainties from fragmentation, isolation, and intrinsic transverse momentum are absent. The hadroproduction of low mass lepton pairs is therefore an advantageous source of information on the parametrization and size of the gluon density. The increase in luminosity of Run II increases the accessible region of $x_T$ from 0.03 to 0.1. The theoretical uncertainty has been estimated from the scale dependence of the cross sections and found to be small for collider experiments (±2.5% at NLO).

Figure 49. Invariant cross section $E\,d^3\sigma/dp^3$ as a function of $Q_T$ for $pN \to \gamma^* X$ at $p_{\rm lab} = 800$ GeV (plot label: 5 GeV $< Q <$ 6 GeV). The cross section is highly sensitive to the gluon distribution in the proton in regions of $x_T$ where it is poorly constrained in current analyses.
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CONCLUSION: MANIFESTO 

Our goal in this conclusion is not to summarize each of the individual contributions, but to introduce simple guidelines, a "Manifesto", for Run II analyses:$^2$

• Each analysis should provide a way to calculate the likelihood for its data, i.e. the probability of the data given a theory prediction.

• The likelihood information should be stored per- 
manently and made available. 

The current practice is generally to take experimental data, correct for acceptance and smearing, and compare the result to the theoretical predictions. In many cases, the acceptance and smearing corrections depend on the theoretical prediction, and thus the practice may lead to uncontrolled uncertainties. Data are generally presented as tables of central values with one-sigma standard deviations. That information is clearly not enough to reconstruct the likelihood when the uncertainties are not Gaussian distributed. Hence the first guideline of our Manifesto is to provide a way to calculate the likelihood, the probability of the data given a theory. The likelihood contains all the information about the experiment and is the basis for any analysis. It should consist of a code and the necessary input tables of "data". The code can be as simple as a $\chi^2$ calculation when all the appropriate conditions are met, but will be significantly more involved in the general case, see [ ]. The likelihood function should be stored in a format which remains valid for several decades. This means ASCII format for data and simplicity in the code. This is important if we want the experimental data to remain useful even as theoretical calculations evolve. If the experimental results are not tied to theory as it stands in the year 2001, then we will be able to continue to use them, even as the theory evolves from NLO to NNLO to resummed calculations.
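
To make the idea of "a code and the necessary input tables" concrete, the following is a minimal sketch of what such a package could look like in the two simplest cases: independent Poisson-distributed bin counts, and the uncorrelated Gaussian limit in which the likelihood reduces to a $\chi^2$. The table layout, the function names, and the numbers are hypothetical and invented for illustration. A realistic likelihood would also have to encode correlated systematic uncertainties, acceptance, and smearing, which this sketch deliberately omits.

```python
import math

# Hypothetical ASCII input table: bin index and observed event count.
ASCII_TABLE = """\
# bin  observed_events   (invented numbers, for illustration only)
1  12
2   7
3   3
"""


def read_counts(text):
    """Parse the ASCII table, ignoring comments and blank lines."""
    counts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        _bin_index, n_obs = line.split()
        counts.append(int(n_obs))
    return counts


def poisson_log_likelihood(observed, expected):
    """ln P(data | theory) for independent Poisson-distributed counts.

    `expected` holds the predicted event counts per bin, i.e. the theory
    prediction after acceptance and smearing have been applied to it.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for n, mu in zip(observed, expected):
        if mu <= 0.0:
            raise ValueError("expected counts must be positive")
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def chi2(measured, sigma, predicted):
    """Gaussian limit with uncorrelated errors: -2 ln L up to a constant."""
    return sum(((m - p) / s) ** 2
               for m, s, p in zip(measured, sigma, predicted))


if __name__ == "__main__":
    data = read_counts(ASCII_TABLE)
    theory = [10.0, 8.0, 2.5]  # hypothetical predicted counts per bin
    print(f"ln L = {poisson_log_likelihood(data, theory):.3f}")
```

Keeping the inputs in plain ASCII and the code free of external dependencies is in the spirit of the requirement that the likelihood remain usable for several decades.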

The likelihood functions should be stored in a central repository and treated in the same fashion as papers$^3$. This is important because collaborations evolve over time and eventually disappear.

Note that the burden is of course not just on the experimental side. Theoreticians need to provide predictions with understood theoretical uncertainties over a defined kinematic range. Numerical calculations should be made more efficient. Codes are usually written with the anticipation that they will be run a few times with a few different PDFs. One can anticipate that, if the goal of extracting uncertainties for the PDFs from data is to be reached, these codes will have to be run many orders of magnitude more often. Event generators are preferable as they allow a better match to experimental cuts and the possibility of comparing smeared theory to raw data. A central repository for the theoretical code would also be very helpful.

In this series of workshops several groups reported 
significant progress towards extracting PDFs from data 
with uncertainties [ ]. Note also that other groups, not connected to this workshop [ ], have reported results on PDF uncertainties since this workshop started.
We are therefore optimistic that realistic PDF uncer- 
tainties will be available from several groups by the 
start of Run II at the Tevatron. 

Progress has also been made on the study of the 
best way to present data [ ] for Run II. Clearly, the
use of the Run II Tevatron data to their full potential 
will require planning and care through a collaborative 
effort between phenomenologists and experimentalists. 
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2 clearly this manifesto could be applied to any experiment 

3 Auxiliary files in the FNAL preprint database, or Web pages, may be one location.
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